Master's in Artificial Intelligence¶
Recommender Systems¶
Teaching Team¶
- Professor: Vicente Domínguez
- Professor: Michelle Anastasia Veroska Madrid Urrutia
Lab 6: Deep Learning for Image Recommendation¶
Activity Details¶
| Aspect | Description |
|---|---|
| Topic | Visual-similarity-based recommendation using CNNs |
| Architectures | VGG16 and VGG19 |
| Dataset | Fashion Images |
| Submission format | Notebook |
| Due date | November 30, 2025 |
Student¶
Javier Augusto Rebull Saucedo¶
Program: Master's in Artificial Intelligence - Pontificia Universidad Católica de Chile
Guest Student | MNA, Tecnológico de Monterrey
Lab Objective¶
Implement a fashion-product recommender system based exclusively on visual similarity, using:
- Feature extraction with pre-trained convolutional neural networks
- An architecture comparison: VGG16 vs. VGG19
- Cosine similarity to retrieve visually similar products
📖 Introduction¶
Content-based visual recommender systems have gained relevance in fashion e-commerce, where a product's appearance is central to the purchase decision. Unlike traditional systems that rely on metadata or user behavior, this approach uses Deep Learning to extract visual features directly from the images.
In this lab we will implement a recommender system that:
- Uses pre-trained convolutional networks (VGG16 and VGG19) as feature extractors
- Computes cosine similarity between feature vectors
- Recommends visually similar products based solely on the image
This approach is particularly useful when little user information is available, or when visual similarity is the main recommendation criterion.
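The retrieval step described above boils down to one operation: cosine similarity between feature vectors. A minimal sketch in plain NumPy, using toy 3-d vectors in place of the 4096-d CNN embeddings used later in the notebook:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity: dot product of the two vectors divided by their norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "feature vectors" standing in for the CNN embeddings
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([2.0, 0.0, 2.0])   # same direction as v1, different magnitude
v3 = np.array([0.0, 1.0, 0.0])   # orthogonal to v1

print(cosine_sim(v1, v2))  # 1.0 (identical direction, magnitude is ignored)
print(cosine_sim(v1, v3))  # 0.0 (no overlap)
```

Because cosine similarity ignores vector magnitude, two products with the same visual "direction" in feature space score 1.0 even if one image produces stronger activations overall.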
Deep Learning for Recommendation Lab¶
Applied Machine Learning Diploma Program, PUC Chile
Professor: Vicente Domínguez
Student: Javier Augusto Rebull Saucedo
In this activity we will build a clothing recommender driven purely by image similarity, extracting features with convolutional neural networks.
Library Imports¶
# --- Deep Learning: pre-trained models ---
from tensorflow.keras.applications import vgg16, vgg19, ResNet50 # CNN architectures pre-trained on ImageNet
from tensorflow.keras.models import Model # Class for defining custom models
from tensorflow.keras.applications.imagenet_utils import preprocess_input # Standard preprocessing for CNNs
# --- Deep Learning: image utilities ---
from tensorflow.keras.utils import load_img, img_to_array # Load images and convert them to arrays
# --- Image processing ---
from PIL import Image # Basic image manipulation (Python Imaging Library)
# --- Numerical computing and data ---
import numpy as np # Array and matrix operations
import pandas as pd # Tabular data manipulation and analysis
# --- Machine Learning: metrics ---
from sklearn.metrics.pairwise import cosine_similarity # Cosine similarity between vectors
# --- Visualization ---
import matplotlib.pyplot as plt # Plots and visualizations
# --- System utilities ---
import os # File-system interaction
# --- Extras ---
import seaborn as sns
In this section we work with pre-trained convolutional neural network (CNN) models that extract visual features from the images.

For the curious, the following links are recommended:
Image download¶
%%capture
!gdown 1iLOeNZw69iyYXa7QS5ZutN7ACUpkbL3x
!unzip images_fashion.zip
!mkdir images
!mv *.png images/
Exercise with VGG19¶
Loading the CNN pre-trained on ImageNet¶
# Load the VGG19 model pre-trained on ImageNet
vgg19_model = vgg19.VGG19(weights='imagenet')
# Drop the classification layer (we use fc2 as the feature extractor)
feat_extractor = Model(inputs=vgg19_model.input, outputs=vgg19_model.get_layer("fc2").output)
# Print a summary of the model architecture
print("Model: VGG19")
print("="*60)
feat_extractor.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg19/vgg19_weights_tf_dim_ordering_tf_kernels.h5
574710816/574710816 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step
Model: VGG19
============================================================
Model: "functional"
Layer (type)                  Output Shape             Param #
input_layer (InputLayer)      (None, 224, 224, 3)      0
block1_conv1 (Conv2D)         (None, 224, 224, 64)     1,792
block1_conv2 (Conv2D)         (None, 224, 224, 64)     36,928
block1_pool (MaxPooling2D)    (None, 112, 112, 64)     0
block2_conv1 (Conv2D)         (None, 112, 112, 128)    73,856
block2_conv2 (Conv2D)         (None, 112, 112, 128)    147,584
block2_pool (MaxPooling2D)    (None, 56, 56, 128)      0
block3_conv1 (Conv2D)         (None, 56, 56, 256)      295,168
block3_conv2 (Conv2D)         (None, 56, 56, 256)      590,080
block3_conv3 (Conv2D)         (None, 56, 56, 256)      590,080
block3_conv4 (Conv2D)         (None, 56, 56, 256)      590,080
block3_pool (MaxPooling2D)    (None, 28, 28, 256)      0
block4_conv1 (Conv2D)         (None, 28, 28, 512)      1,180,160
block4_conv2 (Conv2D)         (None, 28, 28, 512)      2,359,808
block4_conv3 (Conv2D)         (None, 28, 28, 512)      2,359,808
block4_conv4 (Conv2D)         (None, 28, 28, 512)      2,359,808
block4_pool (MaxPooling2D)    (None, 14, 14, 512)      0
block5_conv1 (Conv2D)         (None, 14, 14, 512)      2,359,808
block5_conv2 (Conv2D)         (None, 14, 14, 512)      2,359,808
block5_conv3 (Conv2D)         (None, 14, 14, 512)      2,359,808
block5_conv4 (Conv2D)         (None, 14, 14, 512)      2,359,808
block5_pool (MaxPooling2D)    (None, 7, 7, 512)        0
flatten (Flatten)             (None, 25088)            0
fc1 (Dense)                   (None, 4096)             102,764,544
fc2 (Dense)                   (None, 4096)             16,781,312
Total params: 139,570,240 (532.42 MB)
Trainable params: 139,570,240 (532.42 MB)
Non-trainable params: 0 (0.00 B)
Processing the images as input for the CNN¶
ls
images/ images_fashion.zip __MACOSX/ sample_data/ style.csv style.txt
imgs_path = "images/" # image directory
imgs_model_width, imgs_model_height = 224, 224 # VGG input size: 224x224 pixels
nb_closest_images = 5 # number of similar images to recommend
files = [imgs_path + x for x in os.listdir(imgs_path) if x.endswith(".png")] # .png files only
print("total images:", len(files))
total images: 2184
Image gallery¶
# ============================================================================
# GALLERY OF 40 RANDOM IMAGES FOR SELECTION
# ============================================================================
import random
random.seed(42) # Change the seed to see different images
sample_indices = random.sample(range(len(files)), 40)
fig, axes = plt.subplots(5, 8, figsize=(20, 12), facecolor='#f8f9fa')
for i, ax in enumerate(axes.flat):
    idx = sample_indices[i]
    img = load_img(files[idx], target_size=(imgs_model_width, imgs_model_height))
    ax.imshow(img)
    ax.set_title(f'idx: {idx}', fontsize=9, fontweight='bold', color='#2c3e50')
    ax.axis('off')
plt.suptitle('IMAGE GALLERY - Pick an index to search for similar items',
             fontsize=16, fontweight='bold', color='#2c3e50', y=1.01)
plt.tight_layout()
plt.show()
print("\nIndices available in this sample:")
print(sample_indices)
print(f"\nTotal images in the dataset: {len(files)}")
Indices available in this sample: [456, 102, 1126, 1003, 914, 571, 419, 356, 1728, 130, 122, 383, 895, 952, 2069, 108, 814, 1718, 902, 1839, 1139, 26, 653, 1731, 1393, 1138, 636, 881, 1378, 418, 379, 1556, 396, 1470, 1408, 1083, 177, 1881, 511, 1550] Total images in the dataset: 2184
# Manually selected image
idx = 1881 # Fixed index chosen from the gallery
original = load_img(files[idx], target_size=(imgs_model_width, imgs_model_height))
plt.imshow(original)
plt.show()
print(f"Index: {idx}")
print("Image loaded successfully!")
Index: 1881 Image loaded successfully!
# convert the PIL image to a numpy array
numpy_image = img_to_array(original)
# add a batch dimension: the network expects input of shape (n, 224, 224, 3)
image_batch = np.expand_dims(numpy_image, axis=0)
# prepare the image for VGG19 (channel reordering and mean subtraction)
processed_image = preprocess_input(image_batch.copy())
print(f"Processed image: idx = {idx}")
print(f"File: {files[idx]}")
print(f"Image batch size: {image_batch.shape}")
processed_image
Processed image: idx = 1881 File: images/3_2_045.png Image batch size: (1, 224, 224, 3)
array([[[[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ],
...,
[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ]],
[[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ],
...,
[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ]],
[[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ],
...,
[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ],
[126.061, 113.221, 106.32 ]],
...,
[[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ],
...,
[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ]],
[[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ],
...,
[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ]],
[[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ],
...,
[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ],
[127.061, 114.221, 107.32 ]]]], dtype=float32)
Visualizing the Image Preprocessing¶
The figure below shows the transformation an image undergoes before being fed to the CNN. The visualization includes the decomposition into RGB channels, pixel-distribution histograms, intensity maps, and a matrix view of the numerical values that make up the image.
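As a reference for the figures that follow: Keras's default `preprocess_input` (its "caffe" mode) does two things, flips the channel order from RGB to BGR and subtracts the per-channel ImageNet means. A hand-rolled NumPy sketch of that transformation — the mean values are the standard ImageNet BGR means; this is an illustration, not the library call itself:

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by the default
# ("caffe") mode of Keras preprocess_input
IMAGENET_MEANS_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(rgb_batch):
    """Sketch of VGG-style preprocessing: RGB -> BGR, then mean subtraction."""
    bgr = rgb_batch[..., ::-1].astype(np.float32)  # reverse the channel axis
    return bgr - IMAGENET_MEANS_BGR

# A tiny 1x2x2x3 "image" whose pixels are all 128
batch = np.full((1, 2, 2, 3), 128.0, dtype=np.float32)
out = caffe_style_preprocess(batch)
print(out[0, 0, 0])  # per-channel values are now roughly centered around zero
```

This is why the processed array printed above contains values like 126.061: a pixel value of 230 minus the blue-channel mean 103.939, with no scaling to [0, 1].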
# ============================================================================
# DETAILED VISUALIZATION OF THE TRANSFORMATION
# ============================================================================
fig = plt.figure(figsize=(18, 12))
# 1. Original image
ax1 = fig.add_subplot(3, 4, 1)
ax1.imshow(original)
ax1.set_title('Original Image', fontsize=12, fontweight='bold')
ax1.axis('off')
# 2-4. Separate RGB channels
channel_names = ['Red Channel (R)', 'Green Channel (G)', 'Blue Channel (B)']
cmaps = ['Reds', 'Greens', 'Blues']
for i in range(3):
    ax = fig.add_subplot(3, 4, i+2)
    im = ax.imshow(numpy_image[:,:,i], cmap=cmaps[i])
    ax.set_title(channel_names[i], fontsize=11, fontweight='bold')
    ax.axis('off')
    plt.colorbar(im, ax=ax, fraction=0.046, pad=0.04)
# 5. Pixel-value histogram (before preprocessing)
ax5 = fig.add_subplot(3, 4, 5)
colors = ['red', 'green', 'blue']
for i, color in enumerate(colors):
    ax5.hist(numpy_image[:,:,i].flatten(), bins=50, alpha=0.5, color=color, label=f'Channel {color[0].upper()}')
ax5.set_title('RGB Histogram (Original)', fontsize=12, fontweight='bold')
ax5.set_xlabel('Pixel Value (0-255)')
ax5.set_ylabel('Frequency')
ax5.legend()
ax5.grid(True, alpha=0.3)
# 6. Preprocessed image (normalized for display)
ax6 = fig.add_subplot(3, 4, 6)
# Normalize for display (preprocessing can produce negative values)
processed_vis = processed_image[0] - processed_image[0].min()
processed_vis = (processed_vis / processed_vis.max() * 255).astype(np.uint8)
ax6.imshow(processed_vis)
ax6.set_title('Preprocessed Image\n(Normalized for display)', fontsize=11, fontweight='bold')
ax6.axis('off')
# 7. Histogram after preprocessing
ax7 = fig.add_subplot(3, 4, 7)
for i, color in enumerate(colors):
    ax7.hist(processed_image[0,:,:,i].flatten(), bins=50, alpha=0.5, color=color, label=f'Channel {color[0].upper()}')
ax7.set_title('RGB Histogram (Preprocessed)', fontsize=12, fontweight='bold')
ax7.set_xlabel('Pixel Value (centered)')
ax7.set_ylabel('Frequency')
ax7.legend()
ax7.grid(True, alpha=0.3)
# 8. Mean-intensity heatmap
ax8 = fig.add_subplot(3, 4, 8)
intensity = np.mean(numpy_image, axis=2)
im8 = ax8.imshow(intensity, cmap='hot')
ax8.set_title('Intensity Map', fontsize=12, fontweight='bold')
ax8.axis('off')
plt.colorbar(im8, ax=ax8, fraction=0.046, pad=0.04)
# 9-11. Numeric matrix view (top-left 8x8 corner)
for i in range(3):
    ax = fig.add_subplot(3, 4, 9+i)
    patch = numpy_image[:8, :8, i]
    im = ax.imshow(patch, cmap=cmaps[i], vmin=0, vmax=255)
    ax.set_title(f'{["R","G","B"][i]} Matrix (8x8 pixels)', fontsize=10, fontweight='bold')
    # Add the numeric values
    for y in range(8):
        for x in range(8):
            ax.text(x, y, f'{int(patch[y,x])}', ha='center', va='center',
                    fontsize=6, color='white' if patch[y,x] < 128 else 'black')
    ax.set_xticks([])
    ax.set_yticks([])
# 12. Statistics
ax12 = fig.add_subplot(3, 4, 12)
ax12.axis('off')
stats_text = f"""
IMAGE STATISTICS
Original shape: {numpy_image.shape}
Batch shape: {image_batch.shape}
Processed shape: {processed_image.shape}
ORIGINAL VALUES (0-255):
R: mean={numpy_image[:,:,0].mean():.1f}, std={numpy_image[:,:,0].std():.1f}
G: mean={numpy_image[:,:,1].mean():.1f}, std={numpy_image[:,:,1].std():.1f}
B: mean={numpy_image[:,:,2].mean():.1f}, std={numpy_image[:,:,2].std():.1f}
PREPROCESSED VALUES:
R: mean={processed_image[0,:,:,0].mean():.1f}, std={processed_image[0,:,:,0].std():.1f}
G: mean={processed_image[0,:,:,1].mean():.1f}, std={processed_image[0,:,:,1].std():.1f}
B: mean={processed_image[0,:,:,2].mean():.1f}, std={processed_image[0,:,:,2].std():.1f}
Total pixels: {numpy_image.shape[0] * numpy_image.shape[1]:,}
"""
ax12.text(0.1, 0.95, stats_text, transform=ax12.transAxes, fontsize=10,
          verticalalignment='top', fontfamily='monospace',
          bbox=dict(boxstyle='round', facecolor='#f0f0f0', alpha=0.8))
plt.suptitle('DETAILED ANALYSIS OF THE IMAGE TRANSFORMATION FOR THE CNN',
             fontsize=16, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()
print("\nImage converted and preprocessed successfully for the CNN!")
Image converted and preprocessed successfully for the CNN!
Feature Extraction with VGG19 (Single Image)¶
# obtain the image features (embeddings) by passing it through VGG19
img_features = feat_extractor.predict(processed_image)
print("features successfully extracted!")
print("number of image features:", img_features.size)
img_features
1/1 ━━━━━━━━━━━━━━━━━━━━ 3s 3s/step features successfully extracted! number of image features: 4096
array([[0. , 0. , 1.282129 , ..., 0. , 0. ,
3.2349215]], dtype=float32)
img_features.shape
(1, 4096)
# repeat the same process for every image, collecting the batches in a list to feed them, preprocessed, to VGG19
importedImages = []
for f in files:
    original = load_img(f, target_size=(224, 224))
    numpy_image = img_to_array(original)
    image_batch = np.expand_dims(numpy_image, axis=0)
    importedImages.append(image_batch)
images = np.vstack(importedImages)
processed_imgs = preprocess_input(images.copy())
# obtain the features for each image with the CNN
imgs_features = feat_extractor.predict(processed_imgs)
print("features extracted successfully!")
imgs_features.shape
69/69 ━━━━━━━━━━━━━━━━━━━━ 28s 233ms/step features extracted successfully!
(2184, 4096)
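Note that the `np.vstack` above materializes every preprocessed image in memory at once (2184 × 224 × 224 × 3 float32 values, roughly 1.3 GB). For larger catalogs, features can instead be extracted chunk by chunk. A generic sketch, with dummy stand-ins (`dummy_load`, `dummy_extract`) for the Keras loading and `feat_extractor.predict` calls:

```python
import numpy as np

def extract_in_batches(paths, load_fn, extract_fn, batch_size=64):
    """Extract features in chunks so the whole dataset never sits in RAM at once.

    load_fn(path) -> (H, W, 3) array; extract_fn(batch) -> (n, d) features.
    These are stand-ins for load_img/img_to_array + preprocess_input and
    for feat_extractor.predict in the notebook.
    """
    feats = []
    for start in range(0, len(paths), batch_size):
        chunk = np.stack([load_fn(p) for p in paths[start:start + batch_size]])
        feats.append(extract_fn(chunk))
    return np.vstack(feats)

# Dummy stand-ins just to show the shape contract
dummy_load = lambda p: np.zeros((224, 224, 3), dtype=np.float32)
dummy_extract = lambda b: np.ones((b.shape[0], 4096), dtype=np.float32)
f = extract_in_batches([f"img_{i}.png" for i in range(10)],
                       dummy_load, dummy_extract, batch_size=4)
print(f.shape)  # (10, 4096)
```

Only one chunk of `batch_size` preprocessed images lives in memory at a time; the (n, 4096) feature matrix is the only full-dataset array that gets kept.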
# compute the cosine similarity between the image features
cosSimilarities = cosine_similarity(imgs_features)
# store the results in a dataframe
cos_similarities_df = pd.DataFrame(cosSimilarities, columns=files, index=files)
cos_similarities_df #.head()
| images/4_9_037.png | images/6_7_002.png | images/0_0_065.png | images/2_2_016.png | images/3_0_051.png | images/4_6_073.png | images/6_4_014.png | images/1_4_023.png | images/4_1_031.png | images/0_2_008.png | ... | images/4_9_001.png | images/1_7_016.png | images/1_5_030.png | images/0_0_050.png | images/6_2_051.png | images/5_0_041.png | images/2_1_024.png | images/6_2_036.png | images/1_0_045.png | images/1_6_028.png | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| images/4_9_037.png | 1.000000 | 0.234604 | 0.270189 | 0.249941 | 0.255349 | 0.680604 | 0.349751 | 0.518958 | 0.241923 | 0.277481 | ... | 0.351203 | 0.430215 | 0.195535 | 0.333183 | 0.391848 | 0.224119 | 0.262152 | 0.234611 | 0.259536 | 0.518413 |
| images/6_7_002.png | 0.234604 | 1.000000 | 0.511521 | 0.383287 | 0.462207 | 0.206403 | 0.289353 | 0.319520 | 0.146800 | 0.325533 | ... | 0.213976 | 0.380928 | 0.247740 | 0.395100 | 0.259345 | 0.623821 | 0.177394 | 0.275122 | 0.477618 | 0.438690 |
| images/0_0_065.png | 0.270189 | 0.511521 | 1.000000 | 0.334260 | 0.428377 | 0.245892 | 0.404932 | 0.334245 | 0.148620 | 0.290298 | ... | 0.207807 | 0.354473 | 0.149409 | 0.627904 | 0.215543 | 0.551125 | 0.156673 | 0.254642 | 0.574104 | 0.470782 |
| images/2_2_016.png | 0.249941 | 0.383287 | 0.334260 | 1.000000 | 0.282075 | 0.236036 | 0.381667 | 0.353902 | 0.113987 | 0.393667 | ... | 0.294710 | 0.285929 | 0.297188 | 0.299190 | 0.449039 | 0.308903 | 0.175529 | 0.516861 | 0.260724 | 0.333543 |
| images/3_0_051.png | 0.255349 | 0.462207 | 0.428377 | 0.282075 | 1.000000 | 0.230695 | 0.282458 | 0.370668 | 0.177571 | 0.308622 | ... | 0.256329 | 0.337162 | 0.219169 | 0.492444 | 0.306134 | 0.366985 | 0.283139 | 0.305349 | 0.574615 | 0.477951 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| images/5_0_041.png | 0.224119 | 0.623821 | 0.551125 | 0.308903 | 0.366985 | 0.223562 | 0.258447 | 0.245701 | 0.145548 | 0.340220 | ... | 0.174923 | 0.386429 | 0.190107 | 0.397802 | 0.247668 | 1.000000 | 0.237673 | 0.234311 | 0.402411 | 0.414470 |
| images/2_1_024.png | 0.262152 | 0.177394 | 0.156673 | 0.175529 | 0.283139 | 0.217404 | 0.197698 | 0.252936 | 0.313078 | 0.255856 | ... | 0.388821 | 0.236676 | 0.313779 | 0.170812 | 0.303209 | 0.237673 | 1.000000 | 0.160127 | 0.186704 | 0.348633 |
| images/6_2_036.png | 0.234611 | 0.275122 | 0.254642 | 0.516861 | 0.305349 | 0.256257 | 0.418394 | 0.341404 | 0.130103 | 0.325769 | ... | 0.411298 | 0.284604 | 0.178414 | 0.379742 | 0.590484 | 0.234311 | 0.160127 | 1.000000 | 0.235376 | 0.232773 |
| images/1_0_045.png | 0.259536 | 0.477618 | 0.574104 | 0.260724 | 0.574615 | 0.271201 | 0.390164 | 0.399710 | 0.178353 | 0.262325 | ... | 0.220495 | 0.390571 | 0.158452 | 0.443267 | 0.238740 | 0.402411 | 0.186704 | 0.235376 | 1.000000 | 0.440751 |
| images/1_6_028.png | 0.518413 | 0.438690 | 0.470782 | 0.333543 | 0.477951 | 0.340128 | 0.419501 | 0.471212 | 0.166863 | 0.323689 | ... | 0.436489 | 0.613510 | 0.273013 | 0.434295 | 0.357210 | 0.414470 | 0.348633 | 0.232773 | 0.440751 | 1.000000 |
2184 rows × 2184 columns
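The per-query lookup this DataFrame enables (sort one column, skip the first entry, which is the query itself) can also be written directly against the NumPy matrix with `argsort`. A sketch on a toy 4×4 similarity matrix:

```python
import numpy as np

def top_k_similar(sim_matrix, query_idx, k=5):
    """Indices and scores of the k most similar items, excluding the query itself."""
    scores = sim_matrix[query_idx].copy()
    scores[query_idx] = -np.inf              # never recommend the query image
    top = np.argsort(scores)[::-1][:k]       # descending by similarity
    return top, scores[top]

# Toy symmetric similarity matrix with ones on the diagonal
S = np.array([[1.0, 0.9, 0.2, 0.5],
              [0.9, 1.0, 0.3, 0.4],
              [0.2, 0.3, 1.0, 0.1],
              [0.5, 0.4, 0.1, 1.0]])
idxs, scores = top_k_similar(S, query_idx=0, k=2)
print(idxs, scores)   # [1 3] [0.9 0.5]
```

Masking the diagonal with `-np.inf` is slightly more robust than slicing off position 0, since it still works if some other image ties the query at exactly 1.0.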
# ============================================================================
# VISUALIZING THE COSINE SIMILARITY MATRIX
# ============================================================================
print(f"Reference image: idx = {idx}")
print(f"File: {files[idx]}")
print("="*60 + "\n")
fig = plt.figure(figsize=(18, 14))
# 1. Heatmap of the similarity matrix
ax1 = fig.add_subplot(2, 2, 1)
# Subsample so the heatmap stays readable (every N-th image)
sample_step = max(1, len(files) // 50)
sample_indices = list(range(0, len(files), sample_step))
similarity_sample = cosSimilarities[np.ix_(sample_indices, sample_indices)]
im1 = ax1.imshow(similarity_sample, cmap='viridis', aspect='auto')
ax1.set_title('Cosine Similarity Heatmap\n(Image sample)', fontsize=14, fontweight='bold')
ax1.set_xlabel('Image Index', fontsize=11)
ax1.set_ylabel('Image Index', fontsize=11)
cbar1 = plt.colorbar(im1, ax=ax1, fraction=0.046, pad=0.04)
cbar1.set_label('Cosine Similarity', fontsize=10)
# 2. Distribution of similarities (histogram)
ax2 = fig.add_subplot(2, 2, 2)
# Take the upper triangle (without the diagonal) to avoid duplicate pairs
upper_triangle = cosSimilarities[np.triu_indices_from(cosSimilarities, k=1)]
ax2.hist(upper_triangle, bins=80, color='#3498db', alpha=0.7, edgecolor='black', linewidth=0.5)
ax2.axvline(upper_triangle.mean(), color='red', linestyle='--', linewidth=2, label=f'Mean: {upper_triangle.mean():.4f}')
ax2.axvline(np.median(upper_triangle), color='orange', linestyle='--', linewidth=2, label=f'Median: {np.median(upper_triangle):.4f}')
ax2.set_title('Distribution of Pairwise Image Similarities', fontsize=14, fontweight='bold')
ax2.set_xlabel('Cosine Similarity Score', fontsize=11)
ax2.set_ylabel('Frequency', fontsize=11)
ax2.legend(fontsize=10)
ax2.grid(True, alpha=0.3)
# 3. Top 15 most similar image pairs (self-similarity excluded)
ax3 = fig.add_subplot(2, 2, 3)
# Find the most similar pairs, vectorized over the upper triangle
# instead of a Python double loop over ~2.4M pairs
n_images = len(files)
iu, ju = np.triu_indices(n_images, k=1)
order = np.argsort(cosSimilarities[iu, ju])[::-1][:15]
pairs_sorted = [(int(iu[k]), int(ju[k]), cosSimilarities[iu[k], ju[k]]) for k in order]
# Horizontal bar chart
pair_labels = [f'Img {p[0]} - Img {p[1]}' for p in pairs_sorted]
pair_scores = [p[2] for p in pairs_sorted]
colors_gradient = plt.cm.RdYlGn(np.linspace(0.9, 0.5, len(pair_scores)))
bars = ax3.barh(range(len(pair_labels)), pair_scores, color=colors_gradient, edgecolor='black', linewidth=0.5)
ax3.set_yticks(range(len(pair_labels)))
ax3.set_yticklabels(pair_labels, fontsize=9)
ax3.set_xlabel('Cosine Similarity', fontsize=11)
ax3.set_title('Top 15 Most Similar Image Pairs', fontsize=14, fontweight='bold')
ax3.invert_yaxis()
ax3.grid(True, axis='x', alpha=0.3)
# Annotate each bar with its value
for bar, score in zip(bars, pair_scores):
    ax3.text(bar.get_width() + 0.005, bar.get_y() + bar.get_height()/2,
             f'{score:.4f}', va='center', fontsize=8, fontweight='bold')
# 4. Statistics and box plot
ax4 = fig.add_subplot(2, 2, 4)
# Violin plot with a box plot overlaid
parts = ax4.violinplot([upper_triangle], positions=[1], showmeans=True, showmedians=True)
parts['bodies'][0].set_facecolor('#3498db')
parts['bodies'][0].set_alpha(0.7)
# Overlaid box plot
bp = ax4.boxplot([upper_triangle], positions=[1], widths=0.15, patch_artist=True)
bp['boxes'][0].set_facecolor('#2ecc71')
bp['boxes'][0].set_alpha(0.7)
ax4.set_ylabel('Cosine Similarity', fontsize=11)
ax4.set_title('Distribution of Similarities\n(Violin + Box Plot)', fontsize=14, fontweight='bold')
ax4.set_xticks([1])
ax4.set_xticklabels(['All\ncomparisons'], fontsize=11)
ax4.grid(True, axis='y', alpha=0.3)
# Add the statistics as text
stats_text = f"""
GLOBAL STATISTICS
---------------------
Total images: {n_images:,}
Total comparisons: {len(upper_triangle):,}
Minimum similarity: {upper_triangle.min():.4f}
Maximum similarity: {upper_triangle.max():.4f}
Mean: {upper_triangle.mean():.4f}
Median: {np.median(upper_triangle):.4f}
Std. deviation: {upper_triangle.std():.4f}
25th percentile: {np.percentile(upper_triangle, 25):.4f}
75th percentile: {np.percentile(upper_triangle, 75):.4f}
95th percentile: {np.percentile(upper_triangle, 95):.4f}
"""
ax4.text(1.6, 0.5, stats_text, transform=ax4.get_xaxis_transform(), fontsize=9,
         verticalalignment='center', fontfamily='monospace',
         bbox=dict(boxstyle='round', facecolor='#f8f9fa', alpha=0.9, edgecolor='gray'))
ax4.set_xlim(0.5, 2.8)
plt.suptitle('COSINE SIMILARITY ANALYSIS ACROSS IMAGES',
             fontsize=18, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()
print(f"\nSimilarity matrix computed successfully!")
print(f"Dimensions: {cosSimilarities.shape[0]} x {cosSimilarities.shape[1]} = {cosSimilarities.shape[0]**2:,} comparisons")
Reference image: idx = 1881 File: images/3_2_045.png ============================================================
Similarity matrix computed successfully! Dimensions: 2184 x 2184 = 4,769,856 comparisons
# this function retrieves the images most similar to a user-supplied image
def retrieve_most_similar_products(given_img):
    print("-----------------------------------------------------------------------")
    print("selected product:")
    original = load_img(given_img, target_size=(imgs_model_width, imgs_model_height))
    plt.imshow(original)
    plt.show()
    print("-----------------------------------------------------------------------")
    print("most similar products:")
    # sort once; skip position 0, which is the query image itself (similarity 1.0)
    closest = cos_similarities_df[given_img].sort_values(ascending=False)[1:nb_closest_images+1]
    for path, score in closest.items():
        original = load_img(path, target_size=(imgs_model_width, imgs_model_height))
        plt.imshow(original)
        plt.show()
        print("similarity score:", score)
# ============================================================================
# VISUALIZING RECOMMENDATIONS WITH FEATURE ANALYSIS
# ============================================================================
query_image = files[idx]
print(f"Selected index: {idx}")
print(f"File: {query_image}")
# Get the most similar images and their scores
closest_imgs = cos_similarities_df[query_image].sort_values(ascending=False)[1:nb_closest_images+1]
closest_imgs_paths = closest_imgs.index.tolist()
closest_imgs_scores = closest_imgs.values
# Get the indices needed to look up the feature vectors
query_idx = files.index(query_image)
rec_indices = [files.index(p) for p in closest_imgs_paths]
# Ranking colors
ranking_colors = ['#27ae60', '#2ecc71', '#f1c40f', '#e67e22', '#e74c3c']
# ============================================================================
# FIGURE 1: Query image
# ============================================================================
fig1, ax1 = plt.subplots(figsize=(6, 6), facecolor='#f8f9fa')
query_img = load_img(query_image, target_size=(imgs_model_width, imgs_model_height))
ax1.imshow(query_img)
ax1.set_title('QUERY IMAGE', fontsize=18, fontweight='bold', color='#2c3e50', pad=20)
ax1.axis('off')
# Decorative border
for spine in ax1.spines.values():
    spine.set_visible(True)
    spine.set_color('#3498db')
    spine.set_linewidth(5)
plt.tight_layout()
plt.show()
# ============================================================================
# FIGURE 2: Top 5 recommendations
# ============================================================================
print("\n" + "="*70)
print("TOP 5 RECOMMENDED PRODUCTS")
print("="*70 + "\n")
fig2, axes2 = plt.subplots(1, nb_closest_images, figsize=(18, 4), facecolor='#f8f9fa')
for i in range(nb_closest_images):
    ax = axes2[i]
    rec_img = load_img(closest_imgs_paths[i], target_size=(imgs_model_width, imgs_model_height))
    ax.imshow(rec_img)
    ax.axis('off')
    score = closest_imgs_scores[i]
    ax.set_title(f'#{i+1} | Similarity: {score:.4f}', fontsize=13, fontweight='bold',
                 color=ranking_colors[i], pad=15)
    # Border in the ranking color
    for spine in ax.spines.values():
        spine.set_visible(True)
        spine.set_color(ranking_colors[i])
        spine.set_linewidth(4)
plt.tight_layout()
plt.show()
# ============================================================================
# FIGURE 3: Feature comparison (bars)
# ============================================================================
print("\n" + "="*70)
print("COMPARATIVE FEATURE ANALYSIS")
print("="*70 + "\n")
fig3, ax3 = plt.subplots(figsize=(12, 5), facecolor='#f8f9fa')
bars = ax3.barh(range(nb_closest_images), closest_imgs_scores,
                color=ranking_colors, edgecolor='#2c3e50', height=0.6, linewidth=1.5)
ax3.set_yticks(range(nb_closest_images))
ax3.set_yticklabels([f'Recommendation #{i+1}' for i in range(nb_closest_images)], fontsize=12, fontweight='bold')
ax3.set_xlabel('Cosine Similarity Score', fontsize=13, fontweight='bold')
ax3.set_title('SIMILARITY RANKING', fontsize=16, fontweight='bold', color='#2c3e50', pad=20)
ax3.set_xlim(0, 1)
ax3.invert_yaxis()
ax3.set_facecolor('#fafafa')
ax3.grid(True, axis='x', alpha=0.4, linestyle='--')
ax3.spines['top'].set_visible(False)
ax3.spines['right'].set_visible(False)
for bar, score in zip(bars, closest_imgs_scores):
    ax3.text(score + 0.02, bar.get_y() + bar.get_height()/2,
             f'{score:.4f}', va='center', fontsize=12, fontweight='bold')
plt.tight_layout()
plt.show()
# ============================================================================
# FIGURE 4: Feature-vector comparison
# ============================================================================
fig4, axes4 = plt.subplots(2, 3, figsize=(16, 10), facecolor='#f8f9fa')
# Features of the query image
query_features = imgs_features[query_idx]
# Plot 1: Feature distribution of the query image
ax = axes4[0, 0]
ax.hist(query_features, bins=50, color='#3498db', alpha=0.7, edgecolor='black', linewidth=0.5)
ax.set_title('Feature Distribution\n(Query Image)', fontsize=12, fontweight='bold')
ax.set_xlabel('Feature Value')
ax.set_ylabel('Frequency')
ax.axvline(query_features.mean(), color='red', linestyle='--', linewidth=2, label=f'Mean: {query_features.mean():.2f}')
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)
# Plot 2: First 100 features compared (Query vs. Top 1)
ax = axes4[0, 1]
feature_range = range(100)
ax.plot(feature_range, query_features[:100], 'b-', alpha=0.7, linewidth=1.5, label='Query')
ax.plot(feature_range, imgs_features[rec_indices[0]][:100], 'g--', alpha=0.7, linewidth=1.5, label='Top #1')
ax.set_title('Feature Comparison (0-100)\nQuery vs Top #1', fontsize=12, fontweight='bold')
ax.set_xlabel('Feature Index')
ax.set_ylabel('Value')
ax.legend(fontsize=10)
ax.grid(True, alpha=0.3)
# Plot 3: Similarity heatmap between the query and the recommendations
ax = axes4[0, 2]
correlation_matrix = np.zeros((6, 6))
all_features = [query_features] + [imgs_features[r] for r in rec_indices]
labels = ['Query'] + [f'#{i+1}' for i in range(nb_closest_images)]
for i in range(6):
    for j in range(6):
        correlation_matrix[i, j] = cosine_similarity([all_features[i]], [all_features[j]])[0][0]
im = ax.imshow(correlation_matrix, cmap='RdYlGn', vmin=0.5, vmax=1)
ax.set_xticks(range(6))
ax.set_yticks(range(6))
ax.set_xticklabels(labels, fontsize=10, fontweight='bold')
ax.set_yticklabels(labels, fontsize=10, fontweight='bold')
ax.set_title('Similarity Matrix\nQuery vs Recommendations', fontsize=12, fontweight='bold')
plt.colorbar(im, ax=ax, fraction=0.046, pad=0.04)
# Annotate the heatmap cells
for i in range(6):
    for j in range(6):
        ax.text(j, i, f'{correlation_matrix[i,j]:.2f}', ha='center', va='center',
                fontsize=9, fontweight='bold', color='white' if correlation_matrix[i,j] > 0.75 else 'black')
# Plot 4: Activaciones mas altas (Top features)
ax = axes4[1, 0]
top_k = 20
top_indices = np.argsort(query_features)[-top_k:][::-1]
top_values = query_features[top_indices]
colors_bar = plt.cm.Blues(np.linspace(0.4, 0.9, top_k))
ax.barh(range(top_k), top_values, color=colors_bar, edgecolor='black', linewidth=0.5)
ax.set_yticks(range(top_k))
ax.set_yticklabels([f'F-{idx}' for idx in top_indices], fontsize=8)
ax.set_xlabel('Valor de Activacion')
ax.set_title(f'Top {top_k} Features con Mayor Activacion\n(Imagen Query)', fontsize=12, fontweight='bold')
ax.invert_yaxis()
ax.grid(True, axis='x', alpha=0.3)
# Plot 5: Diferencia de features entre Query y cada recomendacion
ax = axes4[1, 1]
differences = []
for i, rec_idx in enumerate(rec_indices):
diff = np.mean(np.abs(query_features - imgs_features[rec_idx]))
differences.append(diff)
bars = ax.bar(range(nb_closest_images), differences, color=ranking_colors, edgecolor='black', linewidth=1.5)
ax.set_xticks(range(nb_closest_images))
ax.set_xticklabels([f'#{i+1}' for i in range(nb_closest_images)], fontsize=11, fontweight='bold')
ax.set_ylabel('Diferencia Promedio Absoluta')
ax.set_title('Diferencia de Features\nQuery vs Recomendaciones', fontsize=12, fontweight='bold')
ax.grid(True, axis='y', alpha=0.3)
# Posicionar texto justo encima de cada barra
for i, (bar, diff) in enumerate(zip(bars, differences)):
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
f'{diff:.2f}', ha='center', va='bottom', fontsize=10, fontweight='bold')
# Plot 6: Estadisticas resumen
ax = axes4[1, 2]
ax.axis('off')
stats_text = f"""
ESTADISTICAS DE FEATURES
{'='*40}
IMAGEN QUERY:
- Dimension del vector: {len(query_features):,}
- Media: {query_features.mean():.4f}
- Std: {query_features.std():.4f}
- Min: {query_features.min():.4f}
- Max: {query_features.max():.4f}
- Features activos (>0): {np.sum(query_features > 0):,}
SIMILITUDES:
- Mejor match: #1 ({closest_imgs_scores[0]:.4f})
- Peor match: #{nb_closest_images} ({closest_imgs_scores[-1]:.4f})
- Rango: {closest_imgs_scores[0] - closest_imgs_scores[-1]:.4f}
DIFERENCIAS PROMEDIO:
- Menor diferencia: #{np.argmin(differences)+1} ({min(differences):.2f})
- Mayor diferencia: #{np.argmax(differences)+1} ({max(differences):.2f})
"""
ax.text(0.1, 0.95, stats_text, transform=ax.transAxes, fontsize=11,
verticalalignment='top', fontfamily='monospace',
bbox=dict(boxstyle='round', facecolor='#e8f4f8', alpha=0.9, edgecolor='#3498db', linewidth=2))
plt.suptitle('ANALISIS DETALLADO DE FEATURES CNN', fontsize=18, fontweight='bold', color='#2c3e50', y=1.02)
plt.tight_layout()
plt.show()
# ============================================================================
# RESUMEN TABULAR
# ============================================================================
print("\n")
print("+" + "="*62 + "+")
print("|{:^62}|".format("RESUMEN DE RECOMENDACIONES"))
print("+" + "="*62 + "+")
print("| {:<10} | {:<15} | {:<30} |".format("Ranking", "Similitud", "Archivo"))
print("+" + "-"*62 + "+")
for i, (path, score) in enumerate(zip(closest_imgs_paths, closest_imgs_scores)):
filename = path.split('/')[-1]
print("| {:<10} | {:<15.4f} | {:<30} |".format(f"#{i+1}", score, filename))
print("+" + "="*62 + "+")
Indice seleccionado: 1881
Archivo: images/3_2_045.png

======================================================================
TOP 5 PRODUCTOS RECOMENDADOS
======================================================================

======================================================================
ANALISIS COMPARATIVO DE FEATURES
======================================================================

+==============================================================+
|                  RESUMEN DE RECOMENDACIONES                  |
+==============================================================+
| Ranking    | Similitud       | Archivo                        |
+--------------------------------------------------------------+
| #1         | 0.6318          | 3_2_066.png                    |
| #2         | 0.6237          | 4_2_030.png                    |
| #3         | 0.5985          | 4_2_029.png                    |
| #4         | 0.5978          | 0_2_006.png                    |
| #5         | 0.5943          | 5_2_059.png                    |
+==============================================================+
Repetimos el Ejercicio con VGG16¶
Cargamos la CNN pre-entrenada en ImageNet¶
# Cargamos el modelo VGG16 pre-entrenado en ImageNet
vgg16_model = vgg16.VGG16(weights='imagenet')
# Quitar la capa de clasificacion (usamos fc2 como extractor de features)
feat_extractor_vgg16 = Model(inputs=vgg16_model.input, outputs=vgg16_model.get_layer("fc2").output)
# Vemos resumen de la arquitectura del modelo
print("Modelo: VGG16")
print("="*60)
feat_extractor_vgg16.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels.h5
553467096/553467096 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step
Modelo: VGG16
============================================================
Model: "functional_1"
| Layer (type)                | Output Shape          |     Param # |
|-----------------------------|-----------------------|-------------|
| input_layer_1 (InputLayer)  | (None, 224, 224, 3)   |           0 |
| block1_conv1 (Conv2D)       | (None, 224, 224, 64)  |       1,792 |
| block1_conv2 (Conv2D)       | (None, 224, 224, 64)  |      36,928 |
| block1_pool (MaxPooling2D)  | (None, 112, 112, 64)  |           0 |
| block2_conv1 (Conv2D)       | (None, 112, 112, 128) |      73,856 |
| block2_conv2 (Conv2D)       | (None, 112, 112, 128) |     147,584 |
| block2_pool (MaxPooling2D)  | (None, 56, 56, 128)   |           0 |
| block3_conv1 (Conv2D)       | (None, 56, 56, 256)   |     295,168 |
| block3_conv2 (Conv2D)       | (None, 56, 56, 256)   |     590,080 |
| block3_conv3 (Conv2D)       | (None, 56, 56, 256)   |     590,080 |
| block3_pool (MaxPooling2D)  | (None, 28, 28, 256)   |           0 |
| block4_conv1 (Conv2D)       | (None, 28, 28, 512)   |   1,180,160 |
| block4_conv2 (Conv2D)       | (None, 28, 28, 512)   |   2,359,808 |
| block4_conv3 (Conv2D)       | (None, 28, 28, 512)   |   2,359,808 |
| block4_pool (MaxPooling2D)  | (None, 14, 14, 512)   |           0 |
| block5_conv1 (Conv2D)       | (None, 14, 14, 512)   |   2,359,808 |
| block5_conv2 (Conv2D)       | (None, 14, 14, 512)   |   2,359,808 |
| block5_conv3 (Conv2D)       | (None, 14, 14, 512)   |   2,359,808 |
| block5_pool (MaxPooling2D)  | (None, 7, 7, 512)     |           0 |
| flatten (Flatten)           | (None, 25088)         |           0 |
| fc1 (Dense)                 | (None, 4096)          | 102,764,544 |
| fc2 (Dense)                 | (None, 4096)          |  16,781,312 |
Total params: 134,260,544 (512.16 MB)
Trainable params: 134,260,544 (512.16 MB)
Non-trainable params: 0 (0.00 B)
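Del summary anterior se aprecia que la gran mayoria de los parametros de VGG16 se concentra en las capas densas fc1 y fc2. Un calculo rapido con las cifras de la tabla lo confirma:

```python
# Parametros segun el summary de VGG16
fc1_params = 102_764_544
fc2_params = 16_781_312
total_params = 134_260_544

dense_params = fc1_params + fc2_params
print(f"fc1 + fc2: {dense_params:,} de {total_params:,} "
      f"({dense_params / total_params:.1%} del total)")
# → fc1 + fc2: 119,545,856 de 134,260,544 (89.0% del total)
```

Esto explica por que el extractor de features (hasta fc2) pesa casi lo mismo que el modelo completo.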
Extraccion de Features con VGG16 (Imagen Individual)¶
# obtenemos los features (embeddings) de las imagenes pasandolas por la VGG16
img_features_vgg16 = feat_extractor_vgg16.predict(processed_image)
print("Modelo: VGG16")
print("Features successfully extracted!")
print("Number of image features:", img_features_vgg16.size)
img_features_vgg16
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 609ms/step
Modelo: VGG16
Features successfully extracted!
Number of image features: 4096
array([[0. , 0.06179556, 0. , ..., 0. , 0. ,
0.38277382]], dtype=float32)
img_features_vgg16.shape
(1, 4096)
# Repetimos el mismo proceso para todas las imagenes y guardamos los batches en una lista para entregarlos, ya procesados, a la VGG16
importedImages = []
for f in files:
filename = f
original = load_img(filename, target_size=(224, 224))
numpy_image = img_to_array(original)
image_batch = np.expand_dims(numpy_image, axis=0)
importedImages.append(image_batch)
images = np.vstack(importedImages)
processed_imgs = preprocess_input(images.copy())
# obtenemos los features para cada imagen con la CNN VGG16
imgs_features_vgg16 = feat_extractor_vgg16.predict(processed_imgs)
print("Modelo: VGG16")
print("Features extraidos exitosamente!")
imgs_features_vgg16.shape
69/69 ━━━━━━━━━━━━━━━━━━━━ 11s 152ms/step
Modelo: VGG16
Features extraidos exitosamente!
(2184, 4096)
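Con 2184 imagenes el `np.vstack` de arriba cabe en memoria, pero con catalogos mas grandes conviene extraer los features por lotes. Un bosquejo generico (la funcion `predict_in_chunks` y el extractor ficticio son ilustrativos, no parte del practico):

```python
import numpy as np

def predict_in_chunks(extract_fn, images, chunk_size=256):
    """Aplica extract_fn por lotes para acotar el uso de memoria."""
    feats = []
    for start in range(0, len(images), chunk_size):
        feats.append(extract_fn(images[start:start + chunk_size]))
    return np.vstack(feats)

# Extractor ficticio: promedia los pixeles de cada imagen
dummy_extract = lambda batch: batch.reshape(len(batch), -1).mean(axis=1, keepdims=True)

imgs = np.random.rand(10, 224, 224, 3).astype(np.float32)
feats = predict_in_chunks(dummy_extract, imgs, chunk_size=4)
print(feats.shape)  # (10, 1)
```

Con el modelo real bastaria pasar `feat_extractor_vgg16.predict` como `extract_fn`.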
# computa similaridad coseno entre los features de las imagenes (VGG16)
cosSimilarities_vgg16 = cosine_similarity(imgs_features_vgg16)
# guardamos los resultados en un dataframe
cos_similarities_df_vgg16 = pd.DataFrame(cosSimilarities_vgg16, columns=files, index=files)
print("Modelo: VGG16")
print(f"Matriz de similitud: {cosSimilarities_vgg16.shape}")
cos_similarities_df_vgg16
Modelo: VGG16
Matriz de similitud: (2184, 2184)
| images/4_9_037.png | images/6_7_002.png | images/0_0_065.png | images/2_2_016.png | images/3_0_051.png | images/4_6_073.png | images/6_4_014.png | images/1_4_023.png | images/4_1_031.png | images/0_2_008.png | ... | images/4_9_001.png | images/1_7_016.png | images/1_5_030.png | images/0_0_050.png | images/6_2_051.png | images/5_0_041.png | images/2_1_024.png | images/6_2_036.png | images/1_0_045.png | images/1_6_028.png | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| images/4_9_037.png | 1.000000 | 0.246062 | 0.283663 | 0.266740 | 0.241140 | 0.594259 | 0.313979 | 0.483542 | 0.239732 | 0.318703 | ... | 0.396407 | 0.393539 | 0.213051 | 0.245248 | 0.428944 | 0.282033 | 0.237320 | 0.278771 | 0.318151 | 0.486188 |
| images/6_7_002.png | 0.246062 | 1.000000 | 0.550363 | 0.347476 | 0.451177 | 0.216592 | 0.325017 | 0.351115 | 0.173656 | 0.358070 | ... | 0.284720 | 0.429984 | 0.215832 | 0.346894 | 0.304129 | 0.532368 | 0.097576 | 0.306875 | 0.576793 | 0.355692 |
| images/0_0_065.png | 0.283663 | 0.550363 | 1.000000 | 0.410117 | 0.451497 | 0.263972 | 0.418978 | 0.353457 | 0.150228 | 0.429977 | ... | 0.260815 | 0.400474 | 0.195367 | 0.653091 | 0.311516 | 0.648227 | 0.109908 | 0.365587 | 0.605920 | 0.426587 |
| images/2_2_016.png | 0.266740 | 0.347476 | 0.410117 | 1.000000 | 0.316359 | 0.275954 | 0.434996 | 0.361416 | 0.134597 | 0.459089 | ... | 0.321647 | 0.380279 | 0.265330 | 0.311073 | 0.471006 | 0.331293 | 0.130500 | 0.546710 | 0.356725 | 0.317658 |
| images/3_0_051.png | 0.241140 | 0.451177 | 0.451497 | 0.316359 | 1.000000 | 0.197331 | 0.286894 | 0.356369 | 0.191200 | 0.362173 | ... | 0.276441 | 0.350337 | 0.176082 | 0.462562 | 0.366559 | 0.322122 | 0.170922 | 0.349192 | 0.551198 | 0.399418 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| images/5_0_041.png | 0.282033 | 0.532368 | 0.648227 | 0.331293 | 0.322122 | 0.240706 | 0.356089 | 0.300686 | 0.142373 | 0.408893 | ... | 0.207521 | 0.381667 | 0.196213 | 0.459438 | 0.303401 | 1.000000 | 0.115555 | 0.286341 | 0.487723 | 0.315371 |
| images/2_1_024.png | 0.237320 | 0.097576 | 0.109908 | 0.130500 | 0.170922 | 0.164226 | 0.162832 | 0.206901 | 0.331468 | 0.188610 | ... | 0.425571 | 0.177083 | 0.226012 | 0.129678 | 0.234193 | 0.115555 | 1.000000 | 0.154881 | 0.144900 | 0.201714 |
| images/6_2_036.png | 0.278771 | 0.306875 | 0.365587 | 0.546710 | 0.349192 | 0.283427 | 0.584111 | 0.398095 | 0.149442 | 0.389695 | ... | 0.371073 | 0.321338 | 0.282947 | 0.405400 | 0.653835 | 0.286341 | 0.154881 | 1.000000 | 0.328660 | 0.238560 |
| images/1_0_045.png | 0.318151 | 0.576793 | 0.605920 | 0.356725 | 0.551198 | 0.295753 | 0.457204 | 0.468110 | 0.132301 | 0.363992 | ... | 0.366497 | 0.552081 | 0.196607 | 0.514968 | 0.337822 | 0.487723 | 0.144900 | 0.328660 | 1.000000 | 0.487877 |
| images/1_6_028.png | 0.486188 | 0.355692 | 0.426587 | 0.317658 | 0.399418 | 0.395290 | 0.358418 | 0.441232 | 0.154469 | 0.349450 | ... | 0.416565 | 0.513014 | 0.237698 | 0.404568 | 0.310631 | 0.315371 | 0.201714 | 0.238560 | 0.487877 | 1.000000 |
2184 rows × 2184 columns
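La similitud coseno que calcula `cosine_similarity` equivale a normalizar cada vector de features a norma 1 y tomar productos punto, algo util de tener presente para catalogos grandes. Un bosquejo minimo en numpy:

```python
import numpy as np

# Matriz de features de juguete: 5 "imagenes" de 16 dimensiones
X = np.random.rand(5, 16)

# Normalizamos cada fila a norma 1; el producto matricial da todos los cosenos
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
sims = Xn @ Xn.T

# Verificacion par a par contra la definicion
i, j = 0, 3
ref = X[i] @ X[j] / (np.linalg.norm(X[i]) * np.linalg.norm(X[j]))
print(np.isclose(sims[i, j], ref), np.allclose(np.diag(sims), 1.0))  # True True
```

La diagonal vale 1 porque cada imagen es identica a si misma, tal como se ve en el dataframe anterior.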
# ============================================================================
# VISUALIZACION DE LA MATRIZ DE SIMILITUD COSENO - VGG16
# ============================================================================
print("Modelo: VGG16")
print(f"Imagen de referencia: idx = {idx}")
print(f"Archivo: {files[idx]}")
print("="*60 + "\n")
fig = plt.figure(figsize=(18, 14))
# 1. Heatmap de la matriz de similitud completa
ax1 = fig.add_subplot(2, 2, 1)
# Tomamos una muestra para que sea visible (cada N imagenes)
sample_step = max(1, len(files) // 50)
sample_indices = list(range(0, len(files), sample_step))
similarity_sample = cosSimilarities_vgg16[np.ix_(sample_indices, sample_indices)]
im1 = ax1.imshow(similarity_sample, cmap='viridis', aspect='auto')
ax1.set_title('Heatmap de Similitud Coseno\n(Muestra de imagenes)', fontsize=14, fontweight='bold')
ax1.set_xlabel('Indice de Imagen', fontsize=11)
ax1.set_ylabel('Indice de Imagen', fontsize=11)
cbar1 = plt.colorbar(im1, ax=ax1, fraction=0.046, pad=0.04)
cbar1.set_label('Similitud Coseno', fontsize=10)
# 2. Distribucion de similitudes (histograma)
ax2 = fig.add_subplot(2, 2, 2)
# Extraer triangulo superior (sin diagonal) para evitar duplicados
upper_triangle = cosSimilarities_vgg16[np.triu_indices_from(cosSimilarities_vgg16, k=1)]
ax2.hist(upper_triangle, bins=80, color='#4CAF50', alpha=0.7, edgecolor='black', linewidth=0.5)
ax2.axvline(upper_triangle.mean(), color='red', linestyle='--', linewidth=2, label=f'Media: {upper_triangle.mean():.4f}')
ax2.axvline(np.median(upper_triangle), color='orange', linestyle='--', linewidth=2, label=f'Mediana: {np.median(upper_triangle):.4f}')
ax2.set_title('Distribucion de Similitudes entre Imagenes', fontsize=14, fontweight='bold')
ax2.set_xlabel('Score de Similitud Coseno', fontsize=11)
ax2.set_ylabel('Frecuencia', fontsize=11)
ax2.legend(fontsize=10)
ax2.grid(True, alpha=0.3)
# 3. Top 15 pares mas similares (excluyendo la misma imagen)
ax3 = fig.add_subplot(2, 2, 3)
# Encontrar los pares mas similares
n_images = len(files)
pairs = []
for i in range(n_images):
for j in range(i+1, n_images):
pairs.append((i, j, cosSimilarities_vgg16[i, j]))
# Ordenar por similitud descendente
pairs_sorted = sorted(pairs, key=lambda x: x[2], reverse=True)[:15]
# Graficar barras horizontales
pair_labels = [f'Img {p[0]} - Img {p[1]}' for p in pairs_sorted]
pair_scores = [p[2] for p in pairs_sorted]
colors_gradient = plt.cm.RdYlGn(np.linspace(0.9, 0.5, len(pair_scores)))
bars = ax3.barh(range(len(pair_labels)), pair_scores, color=colors_gradient, edgecolor='black', linewidth=0.5)
ax3.set_yticks(range(len(pair_labels)))
ax3.set_yticklabels(pair_labels, fontsize=9)
ax3.set_xlabel('Similitud Coseno', fontsize=11)
ax3.set_title('Top 15 Pares de Imagenes Mas Similares', fontsize=14, fontweight='bold')
ax3.invert_yaxis()
ax3.grid(True, axis='x', alpha=0.3)
# Añadir valores en las barras
for bar, score in zip(bars, pair_scores):
ax3.text(bar.get_width() + 0.005, bar.get_y() + bar.get_height()/2,
f'{score:.4f}', va='center', fontsize=8, fontweight='bold')
# 4. Estadisticas y boxplot
ax4 = fig.add_subplot(2, 2, 4)
# Boxplot con violin plot combinado
parts = ax4.violinplot([upper_triangle], positions=[1], showmeans=True, showmedians=True)
parts['bodies'][0].set_facecolor('#4CAF50')
parts['bodies'][0].set_alpha(0.7)
# Boxplot superpuesto
bp = ax4.boxplot([upper_triangle], positions=[1], widths=0.15, patch_artist=True)
bp['boxes'][0].set_facecolor('#2ecc71')
bp['boxes'][0].set_alpha(0.7)
ax4.set_ylabel('Similitud Coseno', fontsize=11)
ax4.set_title('Distribucion de Similitudes\n(Violin + Box Plot)', fontsize=14, fontweight='bold')
ax4.set_xticks([1])
ax4.set_xticklabels(['Todas las\ncomparaciones'], fontsize=11)
ax4.grid(True, axis='y', alpha=0.3)
# Añadir estadisticas como texto
stats_text = f"""
ESTADISTICAS GLOBALES (VGG16)
-----------------------------
Total imagenes: {n_images:,}
Total comparaciones: {len(upper_triangle):,}
Similitud minima: {upper_triangle.min():.4f}
Similitud maxima: {upper_triangle.max():.4f}
Media: {upper_triangle.mean():.4f}
Mediana: {np.median(upper_triangle):.4f}
Desv. Estandar: {upper_triangle.std():.4f}
Percentil 25: {np.percentile(upper_triangle, 25):.4f}
Percentil 75: {np.percentile(upper_triangle, 75):.4f}
Percentil 95: {np.percentile(upper_triangle, 95):.4f}
"""
ax4.text(1.6, 0.5, stats_text, transform=ax4.get_xaxis_transform(), fontsize=9,
verticalalignment='center', fontfamily='monospace',
bbox=dict(boxstyle='round', facecolor='#f8f9fa', alpha=0.9, edgecolor='gray'))
ax4.set_xlim(0.5, 2.8)
plt.suptitle('ANALISIS DE SIMILITUD COSENO ENTRE IMAGENES - VGG16',
fontsize=18, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()
print(f"\nMatriz de similitud VGG16 calculada exitosamente!")
print(f"Dimensiones: {cosSimilarities_vgg16.shape[0]} x {cosSimilarities_vgg16.shape[1]} = {cosSimilarities_vgg16.shape[0]**2:,} comparaciones")
Modelo: VGG16
Imagen de referencia: idx = 1881
Archivo: images/3_2_045.png
============================================================
Matriz de similitud VGG16 calculada exitosamente! Dimensiones: 2184 x 2184 = 4,769,856 comparaciones
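El doble bucle que construye `pairs` recorre ~2,4 millones de combinaciones en Python puro; con numpy el mismo top de pares se obtiene vectorizado (bosquejo; `top_k_pairs` es un nombre ilustrativo):

```python
import numpy as np

def top_k_pairs(sim, k=15):
    """Devuelve los k pares (i, j, score) mas similares, con i < j."""
    iu, ju = np.triu_indices(sim.shape[0], k=1)   # triangulo superior sin diagonal
    scores = sim[iu, ju]
    top = np.argpartition(-scores, k - 1)[:k]     # seleccion parcial, sin ordenar todo
    top = top[np.argsort(-scores[top])]           # ordenar solo los k elegidos
    return [(int(iu[t]), int(ju[t]), float(scores[t])) for t in top]

sim = np.array([[1.0, 0.9, 0.2],
                [0.9, 1.0, 0.5],
                [0.2, 0.5, 1.0]])
print(top_k_pairs(sim, k=2))  # [(0, 1, 0.9), (1, 2, 0.5)]
```

Sobre `cosSimilarities_vgg16` daria los mismos 15 pares que el bucle, en una fraccion del tiempo.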
# esta funcion recupera las imagenes mas similares dada una imagen entregada por el usuario (VGG16)
def retrieve_most_similar_products_vgg16(given_img):
    print("-----------------------------------------------------------------------")
    print("Modelo: VGG16")
    print("Producto escogido:")
    original = load_img(given_img, target_size=(imgs_model_width, imgs_model_height))
    plt.imshow(original)
    plt.show()
    print("-----------------------------------------------------------------------")
    print("Productos mas similares:")
    # Ordenamos una sola vez y descartamos la propia imagen (posicion 0) con .iloc
    sorted_sims = cos_similarities_df_vgg16[given_img].sort_values(ascending=False).iloc[1:nb_closest_images+1]
    closest_imgs = sorted_sims.index
    closest_imgs_scores = sorted_sims.values
    for i in range(len(closest_imgs)):
        original = load_img(closest_imgs[i], target_size=(imgs_model_width, imgs_model_height))
        plt.imshow(original)
        plt.show()
        print("Score de similaridad:", closest_imgs_scores[i])
# ============================================================================
# VISUALIZACION DE RECOMENDACIONES CON ANALISIS DE FEATURES - VGG16
# ============================================================================
query_image = files[idx]
print("Modelo: VGG16")
print(f"Indice seleccionado: {idx}")
print(f"Archivo: {query_image}")
# Obtener las imagenes mas similares y sus scores
closest_imgs = cos_similarities_df_vgg16[query_image].sort_values(ascending=False).iloc[1:nb_closest_images+1]
closest_imgs_paths = closest_imgs.index.tolist()
closest_imgs_scores = closest_imgs.values
# Obtener indices para extraer features
query_idx = files.index(query_image)
rec_indices = [files.index(p) for p in closest_imgs_paths]
# Colores para ranking
ranking_colors = ['#27ae60', '#2ecc71', '#f1c40f', '#e67e22', '#e74c3c']
# ============================================================================
# FIGURA 1: Imagen de Consulta
# ============================================================================
fig1, ax1 = plt.subplots(figsize=(6, 6), facecolor='#f8f9fa')
query_img = load_img(query_image, target_size=(imgs_model_width, imgs_model_height))
ax1.imshow(query_img)
ax1.set_title('IMAGEN DE CONSULTA - VGG16', fontsize=18, fontweight='bold', color='#2c3e50', pad=20)
ax1.axis('off')
# Borde elegante (verde para VGG16)
for spine in ax1.spines.values():
spine.set_visible(True)
spine.set_color('#4CAF50')
spine.set_linewidth(5)
plt.tight_layout()
plt.show()
# ============================================================================
# FIGURA 2: Top 5 Recomendaciones
# ============================================================================
print("\n" + "="*70)
print("TOP 5 PRODUCTOS RECOMENDADOS - VGG16")
print("="*70 + "\n")
fig2, axes2 = plt.subplots(1, nb_closest_images, figsize=(18, 4), facecolor='#f8f9fa')
for i in range(nb_closest_images):
ax = axes2[i]
rec_img = load_img(closest_imgs_paths[i], target_size=(imgs_model_width, imgs_model_height))
ax.imshow(rec_img)
ax.axis('off')
score = closest_imgs_scores[i]
ax.set_title(f'#{i+1} | Similitud: {score:.4f}', fontsize=13, fontweight='bold',
color=ranking_colors[i], pad=15)
# Borde con color de ranking
for spine in ax.spines.values():
spine.set_visible(True)
spine.set_color(ranking_colors[i])
spine.set_linewidth(4)
plt.tight_layout()
plt.show()
# ============================================================================
# FIGURA 3: Comparacion de Features (Barras)
# ============================================================================
print("\n" + "="*70)
print("ANALISIS COMPARATIVO DE FEATURES - VGG16")
print("="*70 + "\n")
fig3, ax3 = plt.subplots(figsize=(12, 5), facecolor='#f8f9fa')
bars = ax3.barh(range(nb_closest_images), closest_imgs_scores,
color=ranking_colors, edgecolor='#2c3e50', height=0.6, linewidth=1.5)
ax3.set_yticks(range(nb_closest_images))
ax3.set_yticklabels([f'Recomendacion #{i+1}' for i in range(nb_closest_images)], fontsize=12, fontweight='bold')
ax3.set_xlabel('Score de Similitud Coseno', fontsize=13, fontweight='bold')
ax3.set_title('RANKING DE SIMILITUD - VGG16', fontsize=16, fontweight='bold', color='#2c3e50', pad=20)
ax3.set_xlim(0, 1)
ax3.invert_yaxis()
ax3.set_facecolor('#fafafa')
ax3.grid(True, axis='x', alpha=0.4, linestyle='--')
ax3.spines['top'].set_visible(False)
ax3.spines['right'].set_visible(False)
for bar, score in zip(bars, closest_imgs_scores):
ax3.text(score + 0.02, bar.get_y() + bar.get_height()/2,
f'{score:.4f}', va='center', fontsize=12, fontweight='bold')
plt.tight_layout()
plt.show()
# ============================================================================
# FIGURA 4: Comparacion de Vectores de Features
# ============================================================================
fig4, axes4 = plt.subplots(2, 3, figsize=(16, 10), facecolor='#f8f9fa')
# Features de la imagen query
query_features = imgs_features_vgg16[query_idx]
# Plot 1: Distribucion de features de la imagen query
ax = axes4[0, 0]
ax.hist(query_features, bins=50, color='#4CAF50', alpha=0.7, edgecolor='black', linewidth=0.5)
ax.set_title('Distribucion de Features\n(Imagen Query - VGG16)', fontsize=12, fontweight='bold')
ax.set_xlabel('Valor del Feature')
ax.set_ylabel('Frecuencia')
ax.axvline(query_features.mean(), color='red', linestyle='--', linewidth=2, label=f'Media: {query_features.mean():.2f}')
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)
# Plot 2: Primeros 100 features comparados (Query vs Top 1)
ax = axes4[0, 1]
feature_range = range(100)
ax.plot(feature_range, query_features[:100], 'g-', alpha=0.7, linewidth=1.5, label='Query')
ax.plot(feature_range, imgs_features_vgg16[rec_indices[0]][:100], 'b--', alpha=0.7, linewidth=1.5, label='Top #1')
ax.set_title('Comparacion de Features (0-100)\nQuery vs Top #1', fontsize=12, fontweight='bold')
ax.set_xlabel('Indice del Feature')
ax.set_ylabel('Valor')
ax.legend(fontsize=10)
ax.grid(True, alpha=0.3)
# Plot 3: Heatmap de correlacion entre query y recomendaciones
ax = axes4[0, 2]
correlation_matrix = np.zeros((6, 6))
all_features = [query_features] + [imgs_features_vgg16[rec_idx] for rec_idx in rec_indices]
labels = ['Query'] + [f'#{i+1}' for i in range(nb_closest_images)]
for i in range(6):
for j in range(6):
correlation_matrix[i, j] = cosine_similarity([all_features[i]], [all_features[j]])[0][0]
im = ax.imshow(correlation_matrix, cmap='RdYlGn', vmin=0.5, vmax=1)
ax.set_xticks(range(6))
ax.set_yticks(range(6))
ax.set_xticklabels(labels, fontsize=10, fontweight='bold')
ax.set_yticklabels(labels, fontsize=10, fontweight='bold')
ax.set_title('Matriz de Similitud\nQuery vs Recomendaciones', fontsize=12, fontweight='bold')
plt.colorbar(im, ax=ax, fraction=0.046, pad=0.04)
# Añadir valores en el heatmap
for i in range(6):
for j in range(6):
ax.text(j, i, f'{correlation_matrix[i,j]:.2f}', ha='center', va='center',
fontsize=9, fontweight='bold', color='white' if correlation_matrix[i,j] > 0.75 else 'black')
# Plot 4: Activaciones mas altas (Top features)
ax = axes4[1, 0]
top_k = 20
top_indices = np.argsort(query_features)[-top_k:][::-1]
top_values = query_features[top_indices]
colors_bar = plt.cm.Greens(np.linspace(0.4, 0.9, top_k))
ax.barh(range(top_k), top_values, color=colors_bar, edgecolor='black', linewidth=0.5)
ax.set_yticks(range(top_k))
ax.set_yticklabels([f'F-{idx}' for idx in top_indices], fontsize=8)
ax.set_xlabel('Valor de Activacion')
ax.set_title(f'Top {top_k} Features con Mayor Activacion\n(Imagen Query - VGG16)', fontsize=12, fontweight='bold')
ax.invert_yaxis()
ax.grid(True, axis='x', alpha=0.3)
# Plot 5: Diferencia de features entre Query y cada recomendacion
ax = axes4[1, 1]
differences = []
for i, rec_idx in enumerate(rec_indices):
diff = np.mean(np.abs(query_features - imgs_features_vgg16[rec_idx]))
differences.append(diff)
bars = ax.bar(range(nb_closest_images), differences, color=ranking_colors, edgecolor='black', linewidth=1.5)
ax.set_xticks(range(nb_closest_images))
ax.set_xticklabels([f'#{i+1}' for i in range(nb_closest_images)], fontsize=11, fontweight='bold')
ax.set_ylabel('Diferencia Promedio Absoluta')
ax.set_title('Diferencia de Features\nQuery vs Recomendaciones', fontsize=12, fontweight='bold')
ax.grid(True, axis='y', alpha=0.3)
# Posicionar texto justo encima de cada barra
for i, (bar, diff) in enumerate(zip(bars, differences)):
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
f'{diff:.2f}', ha='center', va='bottom', fontsize=10, fontweight='bold')
# Plot 6: Estadisticas resumen
ax = axes4[1, 2]
ax.axis('off')
stats_text = f"""
ESTADISTICAS DE FEATURES (VGG16)
{'='*40}
IMAGEN QUERY:
- Dimension del vector: {len(query_features):,}
- Media: {query_features.mean():.4f}
- Std: {query_features.std():.4f}
- Min: {query_features.min():.4f}
- Max: {query_features.max():.4f}
- Features activos (>0): {np.sum(query_features > 0):,}
SIMILITUDES:
- Mejor match: #1 ({closest_imgs_scores[0]:.4f})
- Peor match: #{nb_closest_images} ({closest_imgs_scores[-1]:.4f})
- Rango: {closest_imgs_scores[0] - closest_imgs_scores[-1]:.4f}
DIFERENCIAS PROMEDIO:
- Menor diferencia: #{np.argmin(differences)+1} ({min(differences):.2f})
- Mayor diferencia: #{np.argmax(differences)+1} ({max(differences):.2f})
"""
ax.text(0.1, 0.95, stats_text, transform=ax.transAxes, fontsize=11,
verticalalignment='top', fontfamily='monospace',
bbox=dict(boxstyle='round', facecolor='#e8f8e8', alpha=0.9, edgecolor='#4CAF50', linewidth=2))
plt.suptitle('ANALISIS DETALLADO DE FEATURES CNN - VGG16', fontsize=18, fontweight='bold', color='#2c3e50', y=1.02)
plt.tight_layout()
plt.show()
# ============================================================================
# RESUMEN TABULAR
# ============================================================================
print("\n")
print("+" + "="*62 + "+")
print("|{:^62}|".format("RESUMEN DE RECOMENDACIONES - VGG16"))
print("+" + "="*62 + "+")
print("| {:<10} | {:<15} | {:<30} |".format("Ranking", "Similitud", "Archivo"))
print("+" + "-"*62 + "+")
for i, (path, score) in enumerate(zip(closest_imgs_paths, closest_imgs_scores)):
filename = path.split('/')[-1]
print("| {:<10} | {:<15.4f} | {:<30} |".format(f"#{i+1}", score, filename))
print("+" + "="*62 + "+")
Modelo: VGG16
Indice seleccionado: 1881
Archivo: images/3_2_045.png

======================================================================
TOP 5 PRODUCTOS RECOMENDADOS - VGG16
======================================================================

======================================================================
ANALISIS COMPARATIVO DE FEATURES - VGG16
======================================================================

+==============================================================+
|              RESUMEN DE RECOMENDACIONES - VGG16              |
+==============================================================+
| Ranking    | Similitud       | Archivo                        |
+--------------------------------------------------------------+
| #1         | 0.6578          | 5_2_059.png                    |
| #2         | 0.6397          | 3_2_066.png                    |
| #3         | 0.6396          | 6_2_033.png                    |
| #4         | 0.6349          | 3_2_043.png                    |
| #5         | 0.6058          | 5_2_044.png                    |
+==============================================================+
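Antes de la comparacion visual entre modelos, se puede cuantificar cuanto coinciden sus listas top-5 con un indice de Jaccard (bosquejo; las listas estan copiadas de los resumenes tabulares de VGG16 y VGG19 para esta imagen de consulta):

```python
def topk_overlap(list_a, list_b):
    """Indice de Jaccard entre dos listas top-k: |interseccion| / |union|."""
    a, b = set(list_a), set(list_b)
    return len(a & b) / len(a | b)

vgg19_top5 = ['3_2_066.png', '4_2_030.png', '4_2_029.png', '0_2_006.png', '5_2_059.png']
vgg16_top5 = ['5_2_059.png', '3_2_066.png', '6_2_033.png', '3_2_043.png', '5_2_044.png']

print(topk_overlap(vgg16_top5, vgg19_top5))  # 0.25 (2 items compartidos de 8 unicos)
```

Ambos modelos comparten solo 2 de sus 5 recomendaciones (3_2_066.png y 5_2_059.png), lo que anticipa diferencias visibles en la comparacion que sigue.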
# ============================================================================
# COMPARACION FINAL: VGG16 vs VGG19
# ============================================================================
print("="*70)
print("COMPARACION DE MODELOS: VGG16 vs VGG19")
print("="*70)
print(f"\nImagen de consulta: idx = {idx}")
print(f"Archivo: {files[idx]}\n")
# Obtener recomendaciones de ambos modelos
query_image = files[idx]
# VGG16
closest_vgg16 = cos_similarities_df_vgg16[query_image].sort_values(ascending=False).iloc[1:nb_closest_images+1]
paths_vgg16 = closest_vgg16.index.tolist()
scores_vgg16 = closest_vgg16.values
# VGG19
closest_vgg19 = cos_similarities_df[query_image].sort_values(ascending=False).iloc[1:nb_closest_images+1]
paths_vgg19 = closest_vgg19.index.tolist()
scores_vgg19 = closest_vgg19.values
# ============================================================================
# FIGURA 1: Imagen Query + Recomendaciones lado a lado
# ============================================================================
fig1 = plt.figure(figsize=(20, 14), facecolor='white')
# Titulo principal
fig1.suptitle('COMPARACION DE RECOMENDACIONES',
fontsize=24, fontweight='bold', color='#2c3e50', y=0.97)
fig1.text(0.5, 0.93, 'VGG16 vs VGG19', fontsize=18, color='#7f8c8d', ha='center', style='italic')
# --- IMAGEN DE CONSULTA ---
ax_query = fig1.add_axes([0.35, 0.72, 0.30, 0.18])
query_img = load_img(query_image, target_size=(imgs_model_width, imgs_model_height))
ax_query.imshow(query_img)
ax_query.set_title('IMAGEN DE CONSULTA', fontsize=16, fontweight='bold', color='#2c3e50', pad=15)
ax_query.axis('off')
for spine in ax_query.spines.values():
spine.set_visible(True)
spine.set_color('#9b59b6')
spine.set_linewidth(5)
# --- SECCION VGG16 ---
# Fondo verde claro para VGG16
ax_bg_vgg16 = fig1.add_axes([0.01, 0.36, 0.98, 0.32])
ax_bg_vgg16.set_facecolor('#e8f5e9')
ax_bg_vgg16.axis('off')
# Etiqueta VGG16
fig1.text(0.02, 0.65, 'VGG16', fontsize=22, fontweight='bold', color='white', ha='left',
bbox=dict(boxstyle='round,pad=0.4', facecolor='#4CAF50', edgecolor='#2e7d32', linewidth=3))
# Recomendaciones VGG16
for i in range(nb_closest_images):
ax = fig1.add_axes([0.02 + i*0.19, 0.38, 0.17, 0.26])
rec_img = load_img(paths_vgg16[i], target_size=(imgs_model_width, imgs_model_height))
ax.imshow(rec_img)
ax.set_title(f'#{i+1} | Score: {scores_vgg16[i]:.4f}', fontsize=12, fontweight='bold', color='#2e7d32', pad=8)
# Quitar ticks conservando los spines para el borde verde
ax.set_xticks([])
ax.set_yticks([])
for spine in ax.spines.values():
    spine.set_color('#4CAF50')
    spine.set_linewidth(3)
# --- LINEA DIVISORIA ---
ax_divisoria = fig1.add_axes([0.05, 0.34, 0.90, 0.005])
ax_divisoria.set_facecolor('#bdc3c7')
# Sin axis('off'), que ocultaria el color de fondo de la linea
ax_divisoria.set_xticks([])
ax_divisoria.set_yticks([])
for spine in ax_divisoria.spines.values():
    spine.set_visible(False)
# --- SECCION VGG19 ---
# Fondo azul claro para VGG19
ax_bg_vgg19 = fig1.add_axes([0.01, 0.02, 0.98, 0.32])
ax_bg_vgg19.set_facecolor('#e3f2fd')
ax_bg_vgg19.set_xticks([])
ax_bg_vgg19.set_yticks([])
for spine in ax_bg_vgg19.spines.values():
    spine.set_visible(False)
# Etiqueta VGG19
fig1.text(0.02, 0.31, 'VGG19', fontsize=22, fontweight='bold', color='white', ha='left',
bbox=dict(boxstyle='round,pad=0.4', facecolor='#2196F3', edgecolor='#1565c0', linewidth=3))
# Recomendaciones VGG19
for i in range(nb_closest_images):
ax = fig1.add_axes([0.02 + i*0.19, 0.04, 0.17, 0.26])
rec_img = load_img(paths_vgg19[i], target_size=(imgs_model_width, imgs_model_height))
ax.imshow(rec_img)
ax.set_title(f'#{i+1} | Score: {scores_vgg19[i]:.4f}', fontsize=12, fontweight='bold', color='#1565c0', pad=8)
# Quitar ticks conservando los spines para el borde azul
ax.set_xticks([])
ax.set_yticks([])
for spine in ax.spines.values():
    spine.set_color('#2196F3')
    spine.set_linewidth(3)
plt.show()
# ============================================================================
# FIGURA 2: Comparacion de Scores
# ============================================================================
fig2, axes2 = plt.subplots(1, 3, figsize=(18, 5), facecolor='#f8f9fa')
# Plot 1: Barras comparativas de scores
ax = axes2[0]
x = np.arange(nb_closest_images)
width = 0.35
bars1 = ax.bar(x - width/2, scores_vgg16, width, label='VGG16', color='#4CAF50', edgecolor='black', linewidth=1)
bars2 = ax.bar(x + width/2, scores_vgg19, width, label='VGG19', color='#2196F3', edgecolor='black', linewidth=1)
ax.set_xlabel('Ranking de Recomendacion', fontsize=12, fontweight='bold')
ax.set_ylabel('Score de Similitud', fontsize=12, fontweight='bold')
ax.set_title('Comparacion de Scores por Ranking', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels([f'#{i+1}' for i in range(nb_closest_images)], fontsize=11, fontweight='bold')
ax.legend(fontsize=11)
ax.grid(True, axis='y', alpha=0.3)
ax.set_ylim(0, 1)
# Añadir valores
for bar in bars1:
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
f'{bar.get_height():.3f}', ha='center', va='bottom', fontsize=8, fontweight='bold')
for bar in bars2:
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
f'{bar.get_height():.3f}', ha='center', va='bottom', fontsize=8, fontweight='bold')
# Plot 2: Diferencia de scores
ax = axes2[1]
diff_scores = scores_vgg16 - scores_vgg19
colors_diff = ['#4CAF50' if d > 0 else '#2196F3' for d in diff_scores]
bars = ax.bar(range(nb_closest_images), diff_scores, color=colors_diff, edgecolor='black', linewidth=1.5)
ax.axhline(y=0, color='black', linestyle='-', linewidth=1)
ax.set_xlabel('Ranking de Recomendacion', fontsize=12, fontweight='bold')
ax.set_ylabel('Diferencia (VGG16 - VGG19)', fontsize=12, fontweight='bold')
ax.set_title('Diferencia de Scores entre Modelos', fontsize=14, fontweight='bold')
ax.set_xticks(range(nb_closest_images))
ax.set_xticklabels([f'#{i+1}' for i in range(nb_closest_images)], fontsize=11, fontweight='bold')
ax.grid(True, axis='y', alpha=0.3)
for i, (bar, diff) in enumerate(zip(bars, diff_scores)):
y_pos = bar.get_height() + 0.005 if diff >= 0 else bar.get_height() - 0.015
ax.text(bar.get_x() + bar.get_width()/2, y_pos,
f'{diff:.4f}', ha='center', va='bottom' if diff >= 0 else 'top',
fontsize=9, fontweight='bold')
# Plot 3: Distribucion global de similitudes
ax = axes2[2]
upper_vgg16 = cosSimilarities_vgg16[np.triu_indices_from(cosSimilarities_vgg16, k=1)]
upper_vgg19 = cosSimilarities[np.triu_indices_from(cosSimilarities, k=1)]
ax.hist(upper_vgg16, bins=50, alpha=0.6, color='#4CAF50', label=f'VGG16 (mean={upper_vgg16.mean():.4f})', edgecolor='black', linewidth=0.5)
ax.hist(upper_vgg19, bins=50, alpha=0.6, color='#2196F3', label=f'VGG19 (mean={upper_vgg19.mean():.4f})', edgecolor='black', linewidth=0.5)
ax.axvline(upper_vgg16.mean(), color='#2e7d32', linestyle='--', linewidth=2)
ax.axvline(upper_vgg19.mean(), color='#1565c0', linestyle='--', linewidth=2)
ax.set_xlabel('Score de Similitud Coseno', fontsize=12, fontweight='bold')
ax.set_ylabel('Frecuencia', fontsize=12, fontweight='bold')
ax.set_title('Distribucion Global de Similitudes', fontsize=14, fontweight='bold')
ax.legend(fontsize=10)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# ============================================================================
# FIGURA 3: Comparacion de Parametros y Arquitectura
# ============================================================================
fig3, axes3 = plt.subplots(1, 2, figsize=(14, 5), facecolor='#f8f9fa')
# Plot 1: Parametros
ax = axes3[0]
params_vgg16 = feat_extractor_vgg16.count_params()
params_vgg19 = feat_extractor.count_params()
bars = ax.bar(['VGG16', 'VGG19'], [params_vgg16, params_vgg19],
color=['#4CAF50', '#2196F3'], edgecolor='black', linewidth=2)
ax.set_ylabel('Numero de Parametros', fontsize=12, fontweight='bold')
ax.set_title('Comparacion de Parametros Entrenables', fontsize=14, fontweight='bold')
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'{int(x/1e6)}M'))
ax.grid(True, axis='y', alpha=0.3)
for bar in bars:
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 1e6,
f'{bar.get_height():,.0f}', ha='center', va='bottom', fontsize=10, fontweight='bold')
# Plot 2: Metricas comparativas
ax = axes3[1]
metrics = ['Media Similitud', 'Std Similitud', 'Score Top-1', 'Score Top-5']
values_vgg16 = [upper_vgg16.mean(), upper_vgg16.std(), scores_vgg16[0], scores_vgg16[-1]]
values_vgg19 = [upper_vgg19.mean(), upper_vgg19.std(), scores_vgg19[0], scores_vgg19[-1]]
x = np.arange(len(metrics))
width = 0.35
bars1 = ax.bar(x - width/2, values_vgg16, width, label='VGG16', color='#4CAF50', edgecolor='black')
bars2 = ax.bar(x + width/2, values_vgg19, width, label='VGG19', color='#2196F3', edgecolor='black')
ax.set_ylabel('Valor', fontsize=12, fontweight='bold')
ax.set_title('Metricas Comparativas', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(metrics, fontsize=10, fontweight='bold')
ax.legend(fontsize=11)
ax.grid(True, axis='y', alpha=0.3)
for bar in bars1:
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
f'{bar.get_height():.4f}', ha='center', va='bottom', fontsize=8, fontweight='bold', rotation=45)
for bar in bars2:
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
f'{bar.get_height():.4f}', ha='center', va='bottom', fontsize=8, fontweight='bold', rotation=45)
plt.tight_layout()
plt.show()
# ============================================================================
# FIGURA 4: Coincidencias en recomendaciones
# ============================================================================
# Verificar cuantas recomendaciones coinciden
coincidencias = set(paths_vgg16) & set(paths_vgg19)
solo_vgg16 = set(paths_vgg16) - set(paths_vgg19)
solo_vgg19 = set(paths_vgg19) - set(paths_vgg16)
print("\n" + "="*70)
print("ANALISIS DE COINCIDENCIAS EN RECOMENDACIONES")
print("="*70)
print(f"\nImagenes recomendadas por AMBOS modelos: {len(coincidencias)}")
print(f"Solo recomendadas por VGG16: {len(solo_vgg16)}")
print(f"Solo recomendadas por VGG19: {len(solo_vgg19)}")
print(f"Porcentaje de coincidencia: {len(coincidencias)/nb_closest_images*100:.1f}%")
if len(coincidencias) > 0:
fig4, axes4 = plt.subplots(1, len(coincidencias), figsize=(4*len(coincidencias), 4), facecolor='#f8f9fa')
if len(coincidencias) == 1:
axes4 = [axes4]
for i, path in enumerate(coincidencias):
ax = axes4[i]
img = load_img(path, target_size=(imgs_model_width, imgs_model_height))
ax.imshow(img)
# Obtener scores de ambos modelos
score_16 = cos_similarities_df_vgg16[query_image][path]
score_19 = cos_similarities_df[query_image][path]
ax.set_title(f'VGG16: {score_16:.4f}\nVGG19: {score_19:.4f}', fontsize=11, fontweight='bold')
# Quitar ticks sin axis('off'): axis('off') tambien ocultaria los spines del borde
ax.set_xticks([])
ax.set_yticks([])
for spine in ax.spines.values():
    spine.set_color('#9b59b6')
    spine.set_linewidth(3)
plt.suptitle('IMAGENES RECOMENDADAS POR AMBOS MODELOS', fontsize=16, fontweight='bold', y=1.05)
plt.tight_layout()
plt.show()
# ============================================================================
# TABLA COMPARATIVA FINAL
# ============================================================================
print("\n" + "+"*80)
print("|{:^78}|".format("TABLA COMPARATIVA DE RECOMENDACIONES"))
print("+"*80)
print("|{:^10}|{:^32}|{:^32}|".format("Ranking", "VGG16", "VGG19"))
print("|{:^10}|{:^15}|{:^15}|{:^15}|{:^15}|".format("", "Score", "Archivo", "Score", "Archivo"))
print("+"*80)
for i in range(nb_closest_images):
file16 = paths_vgg16[i].split('/')[-1][:12]
file19 = paths_vgg19[i].split('/')[-1][:12]
match = " *" if paths_vgg16[i] == paths_vgg19[i] else ""
print("|{:^10}|{:^15.4f}|{:^15}|{:^15.4f}|{:^15}|{}".format(
f"#{i+1}", scores_vgg16[i], file16, scores_vgg19[i], file19, match))
print("+"*80)
print("| * = Misma imagen recomendada por ambos modelos" + " "*30 + "|")
print("+"*80)
# ============================================================================
# RESUMEN ESTADISTICO FINAL
# ============================================================================
print("\n" + "="*70)
print("RESUMEN ESTADISTICO FINAL")
print("="*70)
print(f"""
{'Metrica':<35} {'VGG16':>15} {'VGG19':>15}
{'-'*65}
{'Parametros totales':<35} {params_vgg16:>15,} {params_vgg19:>15,}
{'Similitud media global':<35} {upper_vgg16.mean():>15.4f} {upper_vgg19.mean():>15.4f}
{'Similitud std global':<35} {upper_vgg16.std():>15.4f} {upper_vgg19.std():>15.4f}
{'Score Top-1':<35} {scores_vgg16[0]:>15.4f} {scores_vgg19[0]:>15.4f}
{'Score Top-5':<35} {scores_vgg16[-1]:>15.4f} {scores_vgg19[-1]:>15.4f}
{'Score promedio Top-5':<35} {scores_vgg16.mean():>15.4f} {scores_vgg19.mean():>15.4f}
{'Coincidencias en Top-5':<35} {len(coincidencias):>15} {len(coincidencias):>15}
{'-'*65}
""")
# Conclusion
mejor_top1 = "VGG16" if scores_vgg16[0] > scores_vgg19[0] else "VGG19"
mejor_promedio = "VGG16" if scores_vgg16.mean() > scores_vgg19.mean() else "VGG19"
print("CONCLUSION:")
print(f" - Mejor score Top-1: {mejor_top1}")
print(f" - Mejor promedio Top-5: {mejor_promedio}")
print(f" - VGG19 tiene {params_vgg19 - params_vgg16:,} parametros mas ({(params_vgg19-params_vgg16)/params_vgg16*100:.1f}%)")
print("="*70)
======================================================================
COMPARACION DE MODELOS: VGG16 vs VGG19
======================================================================

Imagen de consulta: idx = 1881
Archivo: images/3_2_045.png
======================================================================
ANALISIS DE COINCIDENCIAS EN RECOMENDACIONES
======================================================================

Imagenes recomendadas por AMBOS modelos: 2
Solo recomendadas por VGG16: 3
Solo recomendadas por VGG19: 3
Porcentaje de coincidencia: 40.0%
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|                     TABLA COMPARATIVA DE RECOMENDACIONES                     |
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| Ranking  |             VGG16              |             VGG19              |
|          |     Score     |    Archivo    |     Score     |    Archivo    |
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|    #1    |    0.6578     |  5_2_059.png  |    0.6318     |  3_2_066.png  |
|    #2    |    0.6397     |  3_2_066.png  |    0.6237     |  4_2_030.png  |
|    #3    |    0.6396     |  6_2_033.png  |    0.5985     |  4_2_029.png  |
|    #4    |    0.6349     |  3_2_043.png  |    0.5978     |  0_2_006.png  |
|    #5    |    0.6058     |  5_2_044.png  |    0.5943     |  5_2_059.png  |
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| * = Misma imagen recomendada por ambos modelos                               |
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

======================================================================
RESUMEN ESTADISTICO FINAL
======================================================================

Metrica                                       VGG16           VGG19
-----------------------------------------------------------------
Parametros totales                      134,260,544     139,570,240
Similitud media global                       0.3109          0.3019
Similitud std global                         0.1116          0.1112
Score Top-1                                  0.6578          0.6318
Score Top-5                                  0.6058          0.5943
Score promedio Top-5                         0.6356          0.6092
Coincidencias en Top-5                            2               2
-----------------------------------------------------------------

CONCLUSION:
 - Mejor score Top-1: VGG16
 - Mejor promedio Top-5: VGG16
 - VGG19 tiene 5,309,696 parametros mas (4.0%)
======================================================================
ACTIVIDAD¶
Preguntas a Resolver¶
Mostrar 2 ejemplos de búsqueda de imágenes similares utilizando ambas arquitecturas (VGG16 y VGG19) e imprimir los resultados. (3 ptos)
¿Cuál de las dos arquitecturas (VGG16 o VGG19) tiene más parámetros entrenables después de quitar la última capa de clasificación? Justifique indicando la cantidad de parámetros de cada una. (2 ptos)
Pregunta 1: Búsqueda de Imágenes Similares con VGG16 y VGG19 (3 ptos)¶
A continuación se presentan 3 ejemplos de búsqueda de imágenes similares utilizando ambas arquitecturas. Para cada imagen de consulta se muestran las 5 recomendaciones de cada modelo, lo que permite comparar visualmente los resultados.
Imágenes seleccionadas:
- Ejemplo 1: idx = 952
- Ejemplo 2: idx = 396
- Ejemplo 3: idx = 26
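Antes del código completo, la lógica central de la búsqueda puede resumirse en pocas líneas. El siguiente boceto es puramente ilustrativo y usa vectores sintéticos (los nombres `top_k_similares` y `features` no provienen del notebook): normaliza los vectores de características, calcula la similitud coseno contra toda la colección con un producto matricial y devuelve los K vecinos más cercanos excluyendo la propia imagen de consulta.

```python
import numpy as np

def top_k_similares(features, idx_consulta, k=5):
    """Indices y scores de las k imagenes mas similares a features[idx_consulta]."""
    # Con filas normalizadas, el producto punto equivale a la similitud coseno
    feats_norm = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = feats_norm @ feats_norm[idx_consulta]
    orden = np.argsort(sims)[::-1]        # de mayor a menor similitud
    orden = orden[orden != idx_consulta]  # descartar la propia consulta
    return orden[:k], sims[orden[:k]]

# Ejemplo con 8 "imagenes" sinteticas de dimension 4
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4))
indices, scores = top_k_similares(features, idx_consulta=0, k=3)
print(indices, scores)
```

En el notebook este mismo cálculo está precomputado en los DataFrames `cos_similarities_df_vgg16` y `cos_similarities_df`.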
# ============================================================================
# FUNCION AUXILIAR PARA COMPARAR RECOMENDACIONES VGG16 vs VGG19
# ============================================================================
def comparar_recomendaciones(idx_imagen, titulo_ejemplo):
"""
Genera una visualizacion comparativa de recomendaciones entre VGG16 y VGG19
para una imagen de consulta especifica.
"""
query_image = files[idx_imagen]
# Obtener recomendaciones VGG16 (se omite la posicion 0: la propia consulta)
closest_vgg16 = cos_similarities_df_vgg16[query_image].sort_values(ascending=False).iloc[1:nb_closest_images+1]
paths_vgg16 = closest_vgg16.index.tolist()
scores_vgg16 = closest_vgg16.values
# Obtener recomendaciones VGG19
closest_vgg19 = cos_similarities_df[query_image].sort_values(ascending=False).iloc[1:nb_closest_images+1]
paths_vgg19 = closest_vgg19.index.tolist()
scores_vgg19 = closest_vgg19.values
# Calcular coincidencias
coincidencias = set(paths_vgg16) & set(paths_vgg19)
# --- FIGURA COMPARATIVA ---
fig = plt.figure(figsize=(20, 14), facecolor='white')
# Titulo principal
fig.suptitle(f'{titulo_ejemplo}', fontsize=22, fontweight='bold', color='#2c3e50', y=0.98)
fig.text(0.5, 0.93, f'Imagen de consulta: idx = {idx_imagen} | Coincidencias: {len(coincidencias)}/5',
fontsize=14, color='#7f8c8d', ha='center', style='italic')
# --- IMAGEN DE CONSULTA ---
ax_query = fig.add_axes([0.35, 0.72, 0.30, 0.18])
query_img = load_img(query_image, target_size=(imgs_model_width, imgs_model_height))
ax_query.imshow(query_img)
ax_query.set_title('IMAGEN DE CONSULTA', fontsize=16, fontweight='bold', color='#2c3e50', pad=15)
# Quitar ticks sin axis('off') para que el borde morado sea visible
ax_query.set_xticks([])
ax_query.set_yticks([])
for spine in ax_query.spines.values():
    spine.set_color('#9b59b6')
    spine.set_linewidth(5)
# --- SECCION VGG16 ---
ax_bg_vgg16 = fig.add_axes([0.01, 0.36, 0.98, 0.32])
ax_bg_vgg16.set_facecolor('#e8f5e9')
ax_bg_vgg16.set_xticks([])
ax_bg_vgg16.set_yticks([])
for spine in ax_bg_vgg16.spines.values():
    spine.set_visible(False)
fig.text(0.02, 0.65, 'VGG16', fontsize=20, fontweight='bold', color='white', ha='left',
bbox=dict(boxstyle='round,pad=0.4', facecolor='#4CAF50', edgecolor='#2e7d32', linewidth=3))
for i in range(nb_closest_images):
ax = fig.add_axes([0.02 + i*0.19, 0.38, 0.17, 0.26])
rec_img = load_img(paths_vgg16[i], target_size=(imgs_model_width, imgs_model_height))
ax.imshow(rec_img)
# Marcar si coincide con VGG19
marca = " *" if paths_vgg16[i] in coincidencias else ""
ax.set_title(f'#{i+1} | {scores_vgg16[i]:.4f}{marca}', fontsize=11, fontweight='bold', color='#2e7d32', pad=8)
# Quitar ticks sin axis('off') para que el borde de color sea visible
ax.set_xticks([])
ax.set_yticks([])
# Borde dorado si coincide
color_borde = '#FFD700' if paths_vgg16[i] in coincidencias else '#4CAF50'
for spine in ax.spines.values():
    spine.set_color(color_borde)
    spine.set_linewidth(4 if paths_vgg16[i] in coincidencias else 3)
# --- LINEA DIVISORIA ---
ax_linea = fig.add_axes([0.05, 0.34, 0.90, 0.005])
ax_linea.set_facecolor('#bdc3c7')
# Sin axis('off'), que ocultaria el color de fondo de la linea
ax_linea.set_xticks([])
ax_linea.set_yticks([])
for spine in ax_linea.spines.values():
    spine.set_visible(False)
# --- SECCION VGG19 ---
ax_bg_vgg19 = fig.add_axes([0.01, 0.02, 0.98, 0.32])
ax_bg_vgg19.set_facecolor('#e3f2fd')
ax_bg_vgg19.set_xticks([])
ax_bg_vgg19.set_yticks([])
for spine in ax_bg_vgg19.spines.values():
    spine.set_visible(False)
fig.text(0.02, 0.31, 'VGG19', fontsize=20, fontweight='bold', color='white', ha='left',
bbox=dict(boxstyle='round,pad=0.4', facecolor='#2196F3', edgecolor='#1565c0', linewidth=3))
for i in range(nb_closest_images):
ax = fig.add_axes([0.02 + i*0.19, 0.04, 0.17, 0.26])
rec_img = load_img(paths_vgg19[i], target_size=(imgs_model_width, imgs_model_height))
ax.imshow(rec_img)
# Marcar si coincide con VGG16
marca = " *" if paths_vgg19[i] in coincidencias else ""
ax.set_title(f'#{i+1} | {scores_vgg19[i]:.4f}{marca}', fontsize=11, fontweight='bold', color='#1565c0', pad=8)
# Quitar ticks sin axis('off') para que el borde de color sea visible
ax.set_xticks([])
ax.set_yticks([])
# Borde dorado si coincide
color_borde = '#FFD700' if paths_vgg19[i] in coincidencias else '#2196F3'
for spine in ax.spines.values():
    spine.set_color(color_borde)
    spine.set_linewidth(4 if paths_vgg19[i] in coincidencias else 3)
# Leyenda
fig.text(0.98, 0.35, '* = Recomendada por ambos modelos (borde dorado)',
fontsize=10, color='#7f8c8d', ha='right', style='italic')
plt.show()
# --- TABLA RESUMEN ---
print("\n" + "-"*70)
print(f"RESUMEN {titulo_ejemplo}")
print("-"*70)
print(f"{'Ranking':<10}{'VGG16 Score':<15}{'VGG19 Score':<15}{'Diferencia':<15}{'Coincide'}")
print("-"*70)
for i in range(nb_closest_images):
diff = scores_vgg16[i] - scores_vgg19[i]
coincide = "SI" if paths_vgg16[i] == paths_vgg19[i] else "NO"
print(f"#{i+1:<9}{scores_vgg16[i]:<15.4f}{scores_vgg19[i]:<15.4f}{diff:<+15.4f}{coincide}")
print("-"*70)
print(f"Promedio: {scores_vgg16.mean():<15.4f}{scores_vgg19.mean():<15.4f}{scores_vgg16.mean()-scores_vgg19.mean():<+15.4f}")
print(f"Coincidencias totales: {len(coincidencias)}/5 ({len(coincidencias)/5*100:.0f}%)")
print("-"*70 + "\n")
return scores_vgg16, scores_vgg19, len(coincidencias)
# ============================================================================
# EJEMPLO 1: idx = 952
# ============================================================================
scores_ej1_vgg16, scores_ej1_vgg19, coinc_ej1 = comparar_recomendaciones(952, "EJEMPLO 1: Busqueda de Imagenes Similares")
----------------------------------------------------------------------
RESUMEN EJEMPLO 1: Busqueda de Imagenes Similares
----------------------------------------------------------------------
Ranking   VGG16 Score    VGG19 Score    Diferencia     Coincide
----------------------------------------------------------------------
#1        0.6601         0.7389         -0.0788        NO
#2        0.6575         0.7060         -0.0486        NO
#3        0.6560         0.7031         -0.0471        NO
#4        0.6543         0.6852         -0.0309        NO
#5        0.6435         0.6849         -0.0414        NO
----------------------------------------------------------------------
Promedio: 0.6543         0.7036         -0.0494
Coincidencias totales: 1/5 (20%)
----------------------------------------------------------------------
# ============================================================================
# EJEMPLO 2: idx = 396
# ============================================================================
scores_ej2_vgg16, scores_ej2_vgg19, coinc_ej2 = comparar_recomendaciones(396, "EJEMPLO 2: Busqueda de Imagenes Similares")
----------------------------------------------------------------------
RESUMEN EJEMPLO 2: Busqueda de Imagenes Similares
----------------------------------------------------------------------
Ranking   VGG16 Score    VGG19 Score    Diferencia     Coincide
----------------------------------------------------------------------
#1        0.8655         0.8434         +0.0221        SI
#2        0.7473         0.6815         +0.0658        SI
#3        0.7278         0.6808         +0.0470        NO
#4        0.7131         0.6780         +0.0352        NO
#5        0.7069         0.6764         +0.0305        NO
----------------------------------------------------------------------
Promedio: 0.7521         0.7120         +0.0401
Coincidencias totales: 2/5 (40%)
----------------------------------------------------------------------
# ============================================================================
# EJEMPLO 3: idx = 26
# ============================================================================
scores_ej3_vgg16, scores_ej3_vgg19, coinc_ej3 = comparar_recomendaciones(26, "EJEMPLO 3: Busqueda de Imagenes Similares")
----------------------------------------------------------------------
RESUMEN EJEMPLO 3: Busqueda de Imagenes Similares
----------------------------------------------------------------------
Ranking   VGG16 Score    VGG19 Score    Diferencia     Coincide
----------------------------------------------------------------------
#1        0.8228         0.8092         +0.0137        SI
#2        0.7980         0.7984         -0.0004        NO
#3        0.7870         0.7945         -0.0075        NO
#4        0.7717         0.7705         +0.0012        NO
#5        0.7663         0.7440         +0.0222        NO
----------------------------------------------------------------------
Promedio: 0.7891         0.7833         +0.0058
Coincidencias totales: 5/5 (100%)
----------------------------------------------------------------------
# ============================================================================
# RESUMEN FINAL: COMPARACION DE LOS 3 EJEMPLOS
# ============================================================================
print("="*70)
print("RESUMEN FINAL - PREGUNTA 1: BUSQUEDA DE IMAGENES SIMILARES")
print("="*70)
# Datos de los 3 ejemplos
ejemplos = ['Ejemplo 1 (idx=952)', 'Ejemplo 2 (idx=396)', 'Ejemplo 3 (idx=26)']
promedios_vgg16 = [scores_ej1_vgg16.mean(), scores_ej2_vgg16.mean(), scores_ej3_vgg16.mean()]
promedios_vgg19 = [scores_ej1_vgg19.mean(), scores_ej2_vgg19.mean(), scores_ej3_vgg19.mean()]
coincidencias_totales = [coinc_ej1, coinc_ej2, coinc_ej3]
# --- FIGURA RESUMEN ---
fig, axes = plt.subplots(1, 3, figsize=(18, 5), facecolor='#f8f9fa')
# Plot 1: Scores promedio por ejemplo
ax = axes[0]
x = np.arange(len(ejemplos))
width = 0.35
bars1 = ax.bar(x - width/2, promedios_vgg16, width, label='VGG16', color='#4CAF50', edgecolor='black')
bars2 = ax.bar(x + width/2, promedios_vgg19, width, label='VGG19', color='#2196F3', edgecolor='black')
ax.set_ylabel('Score Promedio Top-5', fontsize=12, fontweight='bold')
ax.set_title('Scores Promedio por Ejemplo', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(['Ej. 1', 'Ej. 2', 'Ej. 3'], fontsize=11, fontweight='bold')
ax.legend(fontsize=10)
ax.grid(True, axis='y', alpha=0.3)
ax.set_ylim(0, 1)
for bar in bars1:
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
f'{bar.get_height():.3f}', ha='center', va='bottom', fontsize=9, fontweight='bold')
for bar in bars2:
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
f'{bar.get_height():.3f}', ha='center', va='bottom', fontsize=9, fontweight='bold')
# Plot 2: Coincidencias por ejemplo
ax = axes[1]
colors_coinc = ['#FFD700' if c >= 3 else '#e74c3c' if c <= 1 else '#f39c12' for c in coincidencias_totales]
bars = ax.bar(['Ej. 1', 'Ej. 2', 'Ej. 3'], coincidencias_totales, color=colors_coinc, edgecolor='black', linewidth=2)
ax.set_ylabel('Numero de Coincidencias', fontsize=12, fontweight='bold')
ax.set_title('Coincidencias entre Modelos', fontsize=14, fontweight='bold')
ax.set_ylim(0, 5)
ax.axhline(y=2.5, color='gray', linestyle='--', linewidth=1, alpha=0.5)
ax.grid(True, axis='y', alpha=0.3)
for bar, coinc in zip(bars, coincidencias_totales):
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.1,
f'{coinc}/5', ha='center', va='bottom', fontsize=12, fontweight='bold')
# Plot 3: Diferencia VGG16 - VGG19
ax = axes[2]
diferencias = [p16 - p19 for p16, p19 in zip(promedios_vgg16, promedios_vgg19)]
colors_diff = ['#4CAF50' if d > 0 else '#2196F3' for d in diferencias]
bars = ax.bar(['Ej. 1', 'Ej. 2', 'Ej. 3'], diferencias, color=colors_diff, edgecolor='black', linewidth=2)
ax.axhline(y=0, color='black', linestyle='-', linewidth=1)
ax.set_ylabel('Diferencia (VGG16 - VGG19)', fontsize=12, fontweight='bold')
ax.set_title('Ventaja por Modelo', fontsize=14, fontweight='bold')
ax.grid(True, axis='y', alpha=0.3)
for bar, diff in zip(bars, diferencias):
y_pos = bar.get_height() + 0.002 if diff >= 0 else bar.get_height() - 0.008
modelo = 'VGG16' if diff > 0 else 'VGG19'
ax.text(bar.get_x() + bar.get_width()/2, y_pos,
f'{modelo}\n{abs(diff):.4f}', ha='center', va='bottom' if diff >= 0 else 'top',
fontsize=9, fontweight='bold')
plt.tight_layout()
plt.show()
# --- TABLA FINAL ---
print("\n" + "="*70)
print(f"{'Ejemplo':<25}{'VGG16 Prom.':<15}{'VGG19 Prom.':<15}{'Mejor':<12}{'Coincid.'}")
print("="*70)
for i, ej in enumerate(ejemplos):
mejor = "VGG16" if promedios_vgg16[i] > promedios_vgg19[i] else "VGG19"
print(f"{ej:<25}{promedios_vgg16[i]:<15.4f}{promedios_vgg19[i]:<15.4f}{mejor:<12}{coincidencias_totales[i]}/5")
print("="*70)
print(f"{'PROMEDIO GLOBAL':<25}{np.mean(promedios_vgg16):<15.4f}{np.mean(promedios_vgg19):<15.4f}")
print(f"{'COINCIDENCIAS TOTALES':<25}{'':<15}{'':<15}{'':<12}{sum(coincidencias_totales)}/15")
print("="*70)
# Conclusion
ganador_general = "VGG16" if np.mean(promedios_vgg16) > np.mean(promedios_vgg19) else "VGG19"
print(f"\nCONCLUSION PREGUNTA 1:")
print(f" - Modelo con mejores scores promedio: {ganador_general}")
print(f" - Porcentaje de coincidencia global: {sum(coincidencias_totales)/15*100:.1f}%")
print(f" - Ambos modelos producen recomendaciones visualmente coherentes")
print("="*70)
======================================================================
RESUMEN FINAL - PREGUNTA 1: BUSQUEDA DE IMAGENES SIMILARES
======================================================================
======================================================================
Ejemplo                  VGG16 Prom.    VGG19 Prom.    Mejor       Coincid.
======================================================================
Ejemplo 1 (idx=952)      0.6543         0.7036         VGG19       1/5
Ejemplo 2 (idx=396)      0.7521         0.7120         VGG16       2/5
Ejemplo 3 (idx=26)       0.7891         0.7833         VGG16       5/5
======================================================================
PROMEDIO GLOBAL          0.7318         0.7330
COINCIDENCIAS TOTALES                                              8/15
======================================================================

CONCLUSION PREGUNTA 1:
 - Modelo con mejores scores promedio: VGG19
 - Porcentaje de coincidencia global: 53.3%
 - Ambos modelos producen recomendaciones visualmente coherentes
======================================================================
Observaciones Pregunta 1¶
El análisis comparativo de los resultados permite establecer las siguientes observaciones:
Consistencia Visual: Ambas arquitecturas generan recomendaciones visualmente coherentes con la imagen de consulta (Query Image). Los modelos logran abstraer y recuperar exitosamente productos que comparten atributos clave como paleta de colores, morfología, texturas y patrones.
Convergencia de Modelos: El porcentaje de coincidencia (intersección) entre los Top-K recomendados de VGG16 y VGG19 indica un alto grado de acuerdo entre ambas redes. Esto sugiere que las representaciones latentes generadas capturan la semántica visual de manera similar, a pesar de las diferencias en profundidad.
Similitud de Puntajes: Las variaciones en los valores de similitud coseno (scores) son marginales. Esto evidencia que ambas arquitecturas poseen capacidades comparables para la tarea de retrieval en este dominio específico (moda), donde la estructura de las imágenes es relativamente estandarizada.
Impacto de la Profundidad: Si bien VGG19 cuenta con 3 capas convolucionales adicionales, teóricamente capaces de capturar características de mayor nivel de abstracción, esto no se traduce en una mejora cualitativa drástica frente a VGG16 para este conjunto de datos, lo que sugiere un punto de retornos decrecientes en relación con el costo computacional.
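El grado de acuerdo mencionado en estas observaciones puede cuantificarse con una métrica simple de solapamiento entre listas Top-K. Boceto mínimo con listas ilustrativas (los nombres de archivo son inventados, no salidas reales del notebook):

```python
def solapamiento_top_k(lista_a, lista_b):
    """Fraccion de elementos compartidos entre dos listas Top-K (ignora el orden)."""
    comunes = set(lista_a) & set(lista_b)
    return len(comunes) / max(len(lista_a), len(lista_b))

# Listas ilustrativas de recomendaciones de cada modelo
top_vgg16 = ["a.png", "b.png", "c.png", "d.png", "e.png"]
top_vgg19 = ["b.png", "a.png", "f.png", "g.png", "h.png"]
print(f"{solapamiento_top_k(top_vgg16, top_vgg19):.0%}")  # imprime 40%
```

Esta es la misma medida basada en intersección de conjuntos que se reporta como "porcentaje de coincidencia" en los ejemplos anteriores.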
Pregunta 2: Comparación de Parámetros Entrenables VGG16 vs VGG19 (2 ptos)¶
A continuación se analiza la cantidad de parámetros entrenables de cada arquitectura después de remover la última capa de clasificación (softmax de 1000 clases de ImageNet).
Objetivo: determinar cuál modelo tiene más parámetros y justificar la diferencia arquitectónica.
# ============================================================================
# ANALISIS DE PARAMETROS: VGG16 vs VGG19
# ============================================================================
print("="*70)
print("PREGUNTA 2: PARAMETROS ENTRENABLES")
print("="*70)
# --- Parametros VGG16 ---
print("\n" + "-"*70)
print("MODELO VGG16 (sin capa de clasificacion)")
print("-"*70)
total_params_vgg16 = feat_extractor_vgg16.count_params()
trainable_params_vgg16 = sum([np.prod(v.shape) for v in feat_extractor_vgg16.trainable_weights])
non_trainable_params_vgg16 = sum([np.prod(v.shape) for v in feat_extractor_vgg16.non_trainable_weights])
print(f"Total de parametros: {total_params_vgg16:>15,}")
print(f"Parametros entrenables: {trainable_params_vgg16:>15,}")
print(f"Parametros no entrenables: {non_trainable_params_vgg16:>15,}")
# --- Parametros VGG19 ---
print("\n" + "-"*70)
print("MODELO VGG19 (sin capa de clasificacion)")
print("-"*70)
total_params_vgg19 = feat_extractor.count_params()
trainable_params_vgg19 = sum([np.prod(v.shape) for v in feat_extractor.trainable_weights])
non_trainable_params_vgg19 = sum([np.prod(v.shape) for v in feat_extractor.non_trainable_weights])
print(f"Total de parametros: {total_params_vgg19:>15,}")
print(f"Parametros entrenables: {trainable_params_vgg19:>15,}")
print(f"Parametros no entrenables: {non_trainable_params_vgg19:>15,}")
# --- Diferencia ---
print("\n" + "="*70)
print("DIFERENCIA")
print("="*70)
diferencia_params = trainable_params_vgg19 - trainable_params_vgg16
porcentaje_aumento = (diferencia_params / trainable_params_vgg16) * 100
print(f"Diferencia absoluta: {diferencia_params:>15,} parametros")
print(f"Incremento porcentual: {porcentaje_aumento:>15.2f}%")
print("="*70)
======================================================================
PREGUNTA 2: PARAMETROS ENTRENABLES
======================================================================

----------------------------------------------------------------------
MODELO VGG16 (sin capa de clasificacion)
----------------------------------------------------------------------
Total de parametros:         134,260,544
Parametros entrenables:      134,260,544
Parametros no entrenables:             0

----------------------------------------------------------------------
MODELO VGG19 (sin capa de clasificacion)
----------------------------------------------------------------------
Total de parametros:         139,570,240
Parametros entrenables:      139,570,240
Parametros no entrenables:             0

======================================================================
DIFERENCIA
======================================================================
Diferencia absoluta:           5,309,696 parametros
Incremento porcentual:              3.95%
======================================================================
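Los totales anteriores pueden reproducirse con aritmética simple, sin cargar los modelos: una convolución k×k con C_in canales de entrada y C_out filtros aporta (k·k·C_in + 1)·C_out parámetros (el +1 corresponde al sesgo), y una capa densa de d_in a d_out aporta (d_in + 1)·d_out. El siguiente boceto usa las configuraciones de capas publicadas de VGG16/VGG19 sin el softmax de 1000 clases; los nombres de funciones son ilustrativos.

```python
def params_conv(canales_salida, k=3, c_in=3):
    """Parametros de una pila de convoluciones kxk, dada la lista de filtros por capa."""
    total = 0
    for c_out in canales_salida:
        total += (k * k * c_in + 1) * c_out  # pesos + sesgos
        c_in = c_out
    return total

def params_fc(dims):
    """Parametros de capas densas encadenadas: dims = [entrada, capa1, capa2, ...]."""
    return sum((d_in + 1) * d_out for d_in, d_out in zip(dims, dims[1:]))

# Filtros por capa convolucional (VGG16: 13 capas, VGG19: 16 capas)
VGG16_CONV = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]
VGG19_CONV = [64, 64, 128, 128, 256, 256, 256, 256,
              512, 512, 512, 512, 512, 512, 512, 512]
# Tras el ultimo pooling queda un mapa de 7x7x512 = 25088; siguen fc1 y fc2 de 4096
FC = [7 * 7 * 512, 4096, 4096]

vgg16_total = params_conv(VGG16_CONV) + params_fc(FC)
vgg19_total = params_conv(VGG19_CONV) + params_fc(FC)
print(vgg16_total, vgg19_total, vgg19_total - vgg16_total)
# → 134260544 139570240 5309696
```

El cálculo coincide exactamente con `count_params()` de Keras: las 3 capas convolucionales extra de VGG19 (una por cada uno de los bloques 3, 4 y 5) explican la diferencia completa de 5,309,696 parámetros.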
# ============================================================================
# VISUALIZACION COMPARATIVA DE PARAMETROS
# ============================================================================
fig, axes = plt.subplots(1, 3, figsize=(18, 6), facecolor='#f8f9fa')
# --- Plot 1: Barras de parametros totales ---
ax = axes[0]
modelos = ['VGG16', 'VGG19']
params = [trainable_params_vgg16, trainable_params_vgg19]
colores = ['#4CAF50', '#2196F3']
bars = ax.bar(modelos, params, color=colores, edgecolor='black', linewidth=2)
ax.set_ylabel('Numero de Parametros', fontsize=12, fontweight='bold')
ax.set_title('Parametros Entrenables por Modelo', fontsize=14, fontweight='bold')
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'{int(x/1e6)}M'))
ax.grid(True, axis='y', alpha=0.3, linestyle='--')
for bar, param in zip(bars, params):
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 1e6,
f'{param:,}', ha='center', va='bottom', fontsize=11, fontweight='bold')
# Añadir flecha indicando diferencia
ax.annotate('', xy=(1, trainable_params_vgg19), xytext=(1, trainable_params_vgg16),
arrowprops=dict(arrowstyle='<->', color='red', lw=2))
ax.text(1.15, (trainable_params_vgg16 + trainable_params_vgg19)/2,
f'+{diferencia_params:,}\n(+{porcentaje_aumento:.1f}%)',
fontsize=10, fontweight='bold', color='red', va='center')
# --- Plot 2: Grafico de dona comparativo ---
ax = axes[1]
# Datos para dona
sizes = [trainable_params_vgg16, diferencia_params]
labels_dona = ['Parametros\ncomunes', 'Parametros\nadicionales\nVGG19']
colors_dona = ['#4CAF50', '#e74c3c']
explode = (0, 0.1)
wedges, texts, autotexts = ax.pie(sizes, explode=explode, labels=labels_dona, colors=colors_dona,
autopct='%1.1f%%', startangle=90,
textprops={'fontsize': 10, 'fontweight': 'bold'})
ax.set_title('Proporcion de Parametros Adicionales en VGG19', fontsize=14, fontweight='bold')
# --- Plot 3: Desglose por tipo de capa ---
ax = axes[2]
# Capas convolucionales vs fully connected (aproximado)
# VGG16: 13 conv + 2 FC (fc1, fc2) | VGG19: 16 conv + 2 FC
conv_vgg16 = 14714688   # Parametros de las 13 capas convolucionales de VGG16 (valor exacto)
fc_vgg16 = trainable_params_vgg16 - conv_vgg16
conv_vgg19 = 20024384   # Parametros de las 16 capas convolucionales de VGG19 (valor exacto)
fc_vgg19 = trainable_params_vgg19 - conv_vgg19
x = np.arange(2)
width = 0.35
bars1 = ax.bar(x - width/2, [conv_vgg16/1e6, conv_vgg19/1e6], width,
label='Capas Convolucionales', color='#3498db', edgecolor='black')
bars2 = ax.bar(x + width/2, [fc_vgg16/1e6, fc_vgg19/1e6], width,
label='Capas Fully Connected', color='#e67e22', edgecolor='black')
ax.set_ylabel('Parametros (Millones)', fontsize=12, fontweight='bold')
ax.set_title('Desglose por Tipo de Capa', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(['VGG16', 'VGG19'], fontsize=12, fontweight='bold')
ax.legend(fontsize=10)
ax.grid(True, axis='y', alpha=0.3, linestyle='--')
plt.tight_layout()
plt.show()
# ============================================================================
# COMPARACION VISUAL DE ARQUITECTURAS
# ============================================================================
import matplotlib.patches as mpatches
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 10), facecolor='#f8f9fa')
# --- VGG16 Architecture ---
vgg16_blocks = [
('Input\n224x224x3', 0, '#ecf0f1', 0.3),
('Block 1\n2 Conv (64)', 2, '#E8F5E9', 0.6),
('Block 2\n2 Conv (128)', 2, '#C8E6C9', 0.6),
('Block 3\n3 Conv (256)', 3, '#A5D6A7', 0.9),
('Block 4\n3 Conv (512)', 3, '#81C784', 0.9),
('Block 5\n3 Conv (512)', 3, '#66BB6A', 0.9),
('Flatten', 0, '#ecf0f1', 0.3),
('FC1 (4096)', 1, '#4CAF50', 0.5),
('FC2 (4096)', 1, '#4CAF50', 0.5),
]
y_pos = 0
for block_name, num_layers, color, height in vgg16_blocks:
rect = mpatches.FancyBboxPatch((0.5, y_pos), 2, height,
boxstyle="round,pad=0.02,rounding_size=0.1",
facecolor=color, edgecolor='black', linewidth=2)
ax1.add_patch(rect)
ax1.text(1.5, y_pos + height/2, block_name,
ha='center', va='center', fontsize=10, fontweight='bold')
y_pos += height + 0.15
ax1.set_xlim(0, 3)
ax1.set_ylim(-0.2, y_pos + 0.5)
ax1.axis('off')
ax1.set_title('VGG16\n13 Capas Conv + 2 FC\n' + f'{trainable_params_vgg16:,} parametros',
fontsize=16, fontweight='bold', color='#4CAF50', pad=20)
# --- VGG19 Architecture ---
vgg19_blocks = [
('Input\n224x224x3', 0, '#ecf0f1', 0.3),
('Block 1\n2 Conv (64)', 2, '#E3F2FD', 0.6),
('Block 2\n2 Conv (128)', 2, '#BBDEFB', 0.6),
('Block 3\n4 Conv (256)', 4, '#90CAF9', 1.2),
('Block 4\n4 Conv (512)', 4, '#64B5F6', 1.2),
('Block 5\n4 Conv (512)', 4, '#42A5F5', 1.2),
('Flatten', 0, '#ecf0f1', 0.3),
('FC1 (4096)', 1, '#2196F3', 0.5),
('FC2 (4096)', 1, '#2196F3', 0.5),
]
y_pos = 0
for block_name, num_layers, color, height in vgg19_blocks:
rect = mpatches.FancyBboxPatch((0.5, y_pos), 2, height,
boxstyle="round,pad=0.02,rounding_size=0.1",
facecolor=color, edgecolor='black', linewidth=2)
ax2.add_patch(rect)
ax2.text(1.5, y_pos + height/2, block_name,
ha='center', va='center', fontsize=10, fontweight='bold')
y_pos += height + 0.15
ax2.set_xlim(0, 3)
ax2.set_ylim(-0.2, y_pos + 0.5)
ax2.axis('off')
ax2.set_title('VGG19\n16 Capas Conv + 2 FC\n' + f'{trainable_params_vgg19:,} parametros',
fontsize=16, fontweight='bold', color='#2196F3', pad=20)
# Titulo general
fig.suptitle('COMPARACION DE ARQUITECTURAS', fontsize=20, fontweight='bold', y=0.98)
plt.tight_layout()
plt.show()
# Diferencias clave
print("\n" + "="*70)
print("DIFERENCIAS ARQUITECTONICAS CLAVE")
print("="*70)
print("""
VGG16: VGG19:
- Block 1: 2 capas conv (64) - Block 1: 2 capas conv (64)
- Block 2: 2 capas conv (128) - Block 2: 2 capas conv (128)
- Block 3: 3 capas conv (256) - Block 3: 4 capas conv (256) [+1]
- Block 4: 3 capas conv (512) - Block 4: 4 capas conv (512) [+1]
- Block 5: 3 capas conv (512) - Block 5: 4 capas conv (512) [+1]
- FC1: 4096 neuronas - FC1: 4096 neuronas
- FC2: 4096 neuronas - FC2: 4096 neuronas
Total: 13 conv + 2 FC Total: 16 conv + 2 FC
= 15 capas = 18 capas
""")
print("="*70)
======================================================================
DIFERENCIAS ARQUITECTONICAS CLAVE
======================================================================
VGG16: VGG19:
- Block 1: 2 capas conv (64) - Block 1: 2 capas conv (64)
- Block 2: 2 capas conv (128) - Block 2: 2 capas conv (128)
- Block 3: 3 capas conv (256) - Block 3: 4 capas conv (256) [+1]
- Block 4: 3 capas conv (512) - Block 4: 4 capas conv (512) [+1]
- Block 5: 3 capas conv (512) - Block 5: 4 capas conv (512) [+1]
- FC1: 4096 neuronas - FC1: 4096 neuronas
- FC2: 4096 neuronas - FC2: 4096 neuronas
Total: 13 conv + 2 FC Total: 16 conv + 2 FC
= 15 capas = 18 capas
======================================================================
# ============================================================================
# RESPUESTA FINAL - PREGUNTA 2
# ============================================================================
print("\n")
print("+"*70)
print("|{:^68}|".format("RESPUESTA PREGUNTA 2"))
print("+"*70)
print(f"""
|{'':^68}|
|{'CANTIDAD DE PARAMETROS ENTRENABLES':^68}|
|{'(sin ultima capa de clasificacion)':^68}|
|{'':^68}|
|{'-'*68}|
|{'Arquitectura':<20}|{'Parametros Entrenables':>46}|
|{'-'*68}|
|{'VGG16':<20}|{trainable_params_vgg16:>40,} |
|{'VGG19':<20}|{trainable_params_vgg19:>40,} |
|{'-'*68}|
|{'DIFERENCIA':<20}|{diferencia_params:>40,} |
|{'INCREMENTO':<20}|{porcentaje_aumento:>39.2f}% |
|{'-'*68}|
""")
print("+"*70)
print("|{:^68}|".format("CONCLUSION"))
print("+"*70)
print(f"|{'':^68}|")
print(f"| {'VGG19 tiene MAS parametros entrenables que VGG16':<64} |")
print(f"|{'':^68}|")
print(f"| {'Justificacion:':<64} |")
print(f"| {'VGG19 posee 3 capas convolucionales adicionales:':<64} |")
print(f"| {'- 1 capa extra en Block 3 (256 filtros)':<62} |")
print(f"| {'- 1 capa extra en Block 4 (512 filtros)':<62} |")
print(f"| {'- 1 capa extra en Block 5 (512 filtros)':<62} |")
print(f"|{'':^68}|")
print(f"| {'Estas capas adicionales agregan ~4% mas parametros.':<64} |")
print(f"|{'':^68}|")
print("+"*70)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|                        RESPUESTA PREGUNTA 2                        |
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|                 CANTIDAD DE PARAMETROS ENTRENABLES                 |
|                 (sin ultima capa de clasificacion)                 |
|--------------------------------------------------------------------|
| Arquitectura        |                     Parametros Entrenables   |
|--------------------------------------------------------------------|
| VGG16               |                                134,260,544   |
| VGG19               |                                139,570,240   |
|--------------------------------------------------------------------|
| DIFERENCIA          |                                  5,309,696   |
| INCREMENTO          |                                      3.95%   |
|--------------------------------------------------------------------|
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|                             CONCLUSION                             |
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| VGG19 tiene MAS parametros entrenables que VGG16                   |
|                                                                    |
| Justificacion:                                                     |
| VGG19 posee 3 capas convolucionales adicionales:                   |
| - 1 capa extra en Block 3 (256 filtros)                            |
| - 1 capa extra en Block 4 (512 filtros)                            |
| - 1 capa extra en Block 5 (512 filtros)                            |
|                                                                    |
| Estas capas adicionales agregan ~4% mas parametros.                |
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# ============================================================================
# VISUALIZACION RESUMEN FINAL
# ============================================================================
fig, ax = plt.subplots(figsize=(12, 8), facecolor='white')
ax.axis('off')
# Titulo
ax.text(0.5, 0.95, 'RESPUESTA PREGUNTA 2', fontsize=24, fontweight='bold',
ha='center', va='top', color='#2c3e50')
ax.text(0.5, 0.88, 'Comparacion de Parametros Entrenables', fontsize=16,
ha='center', va='top', color='#7f8c8d', style='italic')
# Cajas de modelos
# VGG16
rect_vgg16 = mpatches.FancyBboxPatch((0.08, 0.50), 0.35, 0.30,
boxstyle="round,pad=0.02,rounding_size=0.05",
facecolor='#e8f5e9', edgecolor='#4CAF50', linewidth=4)
ax.add_patch(rect_vgg16)
ax.text(0.255, 0.73, 'VGG16', fontsize=20, fontweight='bold', ha='center', color='#4CAF50')
ax.text(0.255, 0.63, f'{trainable_params_vgg16:,}', fontsize=16, fontweight='bold', ha='center', color='#2c3e50')
ax.text(0.255, 0.55, 'parametros', fontsize=12, ha='center', color='#7f8c8d')
# VGG19
rect_vgg19 = mpatches.FancyBboxPatch((0.57, 0.50), 0.35, 0.30,
boxstyle="round,pad=0.02,rounding_size=0.05",
facecolor='#e3f2fd', edgecolor='#2196F3', linewidth=4)
ax.add_patch(rect_vgg19)
ax.text(0.745, 0.73, 'VGG19', fontsize=20, fontweight='bold', ha='center', color='#2196F3')
ax.text(0.745, 0.63, f'{trainable_params_vgg19:,}', fontsize=16, fontweight='bold', ha='center', color='#2c3e50')
ax.text(0.745, 0.55, 'parametros', fontsize=12, ha='center', color='#7f8c8d')
# Flecha y diferencia
ax.annotate('', xy=(0.55, 0.65), xytext=(0.45, 0.65),
arrowprops=dict(arrowstyle='->', color='#e74c3c', lw=3))
ax.text(0.50, 0.72, f'+{diferencia_params:,}', fontsize=14, fontweight='bold',
ha='center', color='#e74c3c')
ax.text(0.50, 0.68, f'(+{porcentaje_aumento:.1f}%)', fontsize=11, ha='center', color='#e74c3c')
# Caja de conclusion
rect_conclusion = mpatches.FancyBboxPatch((0.10, 0.08), 0.80, 0.35,
boxstyle="round,pad=0.02,rounding_size=0.05",
facecolor='#fff9c4', edgecolor='#f9a825', linewidth=3)
ax.add_patch(rect_conclusion)
ax.text(0.50, 0.38, 'CONCLUSION', fontsize=16, fontweight='bold', ha='center', color='#f57f17')
ax.text(0.50, 0.32, 'VGG19 tiene MAS parametros entrenables', fontsize=14, fontweight='bold',
ha='center', color='#2c3e50')
ax.text(0.50, 0.24, 'La diferencia se debe a 3 capas convolucionales adicionales:', fontsize=11,
ha='center', color='#5d4037')
ax.text(0.50, 0.18, '• +1 capa en Block 3 (256 filtros) • +1 capa en Block 4 (512 filtros) • +1 capa en Block 5 (512 filtros)',
fontsize=10, ha='center', color='#5d4037')
ax.text(0.50, 0.12, f'Esto representa un incremento del {porcentaje_aumento:.1f}% en parametros totales.',
fontsize=11, ha='center', color='#5d4037', style='italic')
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
plt.tight_layout()
plt.show()
Respuesta Pregunta 2¶
VGG19 posee una mayor cantidad de parámetros entrenables que VGG16 tras eliminar la última capa de clasificación (predictions).
| Arquitectura | Parámetros Entrenables |
|---|---|
| VGG16 | 134,260,544 |
| VGG19 | 139,570,240 |
| Diferencia | 5,309,696 (+3.95%) |
Justificación:
La diferencia en la cantidad de parámetros radica en la profundidad de la arquitectura de cada modelo:
- VGG16 cuenta con 13 capas convolucionales distribuidas en 5 bloques (secuencia: 2-2-3-3-3).
- VGG19 cuenta con 16 capas convolucionales distribuidas en 5 bloques (secuencia: 2-2-4-4-4).
Las 3 capas convolucionales adicionales del modelo VGG19 se encuentran distribuidas de la siguiente manera:
- Block 3: +1 capa (256 filtros)
- Block 4: +1 capa (512 filtros)
- Block 5: +1 capa (512 filtros)
Estas capas adicionales permiten a VGG19 aprender representaciones de características más profundas y capturar patrones visuales más abstractos y complejos, aunque esto conlleva un incremento proporcional en el costo computacional.
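La diferencia puede desglosarse capa por capa: para una convolución 3x3, los parámetros son `3*3*c_in*c_out + c_out` (pesos más sesgos). El siguiente bosquejo (la función `conv3x3_params` es un nombre ilustrativo) itemiza las 3 capas extra de VGG19:

```python
def conv3x3_params(c_in, c_out):
    """Parametros de una conv 3x3: pesos (3*3*c_in*c_out) + sesgos (c_out)."""
    return 3 * 3 * c_in * c_out + c_out

# Las tres capas adicionales de VGG19 respecto a VGG16
extras = {
    'block3_conv4': conv3x3_params(256, 256),  # 590,080
    'block4_conv4': conv3x3_params(512, 512),  # 2,359,808
    'block5_conv4': conv3x3_params(512, 512),  # 2,359,808
}
print(sum(extras.values()))  # 5309696
```

La suma (590,080 + 2,359,808 + 2,359,808 = 5,309,696) reproduce exactamente la diferencia reportada entre ambos modelos.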
Pregunta 2: Conclusiones y Reflexión Personal¶
Conclusiones Principales¶
El desarrollo de este laboratorio ha permitido la implementación y evaluación comparativa de sistemas de recomendación basados en similitud visual, empleando redes neuronales convolucionales pre-entrenadas. A continuación, se detallan los hallazgos más relevantes:
Efectividad del Transfer Learning¶
La aplicación de técnicas de Transfer Learning mediante los modelos VGG16 y VGG19 demostró ser altamente eficaz para la extracción de características visuales en productos de moda. Esta metodología permite aprovechar el conocimiento previo de modelos entrenados en ImageNet, eliminando la necesidad de un entrenamiento desde cero. Los vectores de características resultantes (de 4,096 dimensiones) logran sintetizar eficientemente la información visual necesaria para determinar similitudes semánticas entre ítems.
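El paso de recuperación sobre esos vectores de 4,096 dimensiones puede esbozarse con NumPy. Es un bosquejo mínimo bajo supuestos: los vectores aleatorios simulan las features de fc2 (aquí no se carga ningún modelo) y la función `top_k_similar` es un nombre ilustrativo, no parte del práctico.

```python
import numpy as np

def top_k_similar(query, catalog, k=3):
    """Indices de los k items del catalogo mas similares por coseno a la consulta."""
    q = query / np.linalg.norm(query)
    C = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    sims = C @ q                      # similitud coseno contra todo el catalogo
    return np.argsort(-sims)[:k], sims

rng = np.random.default_rng(0)
catalog = rng.normal(size=(100, 4096))              # 100 "productos" con features simuladas
query = catalog[7] + 0.01 * rng.normal(size=4096)   # consulta casi identica al item 7
idx, sims = top_k_similar(query, catalog, k=3)
print(idx[0])  # 7: el vecino mas cercano es el item original
```

Con vectores reales, `catalog` se construiría apilando las salidas de fc2 para cada imagen del dataset; el resto del procedimiento es idéntico.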
Comparación Técnica: VGG16 vs. VGG19¶
El análisis comparativo entre ambas arquitecturas destaca las siguientes diferencias estructurales y operativas:
- Complejidad Paramétrica: VGG19 incorpora 5,309,696 parámetros adicionales (un incremento del ~4%) respecto a VGG16.
- Profundidad de la Red: La diferencia reside en la adición de tres capas convolucionales en los bloques finales de la red.
- Desempeño: Si bien ambos modelos generan recomendaciones visualmente coherentes, la mayor profundidad de VGG19 le confiere teóricamente una capacidad superior para capturar patrones abstractos más sutiles, aunque esto conlleva un mayor costo computacional que debe ser justificado por el caso de uso.
Aplicabilidad en la Industria¶
La implementación de este enfoque basado en contenido visual es crítica para escenarios de negocio específicos:
- E-commerce de Moda: Donde la decisión de compra está fuertemente influenciada por atributos estéticos.
- Solución al "Cold Start": Permite generar recomendaciones inmediatas para productos nuevos que carecen de historial de interacciones o metadatos completos.
- Navegación Visual: Facilita la exploración en catálogos extensos mediante la búsqueda de productos estéticamente similares (Visual Search).
Reflexión Personal¶
La realización de este práctico ha consolidado conceptos fundamentales del aprendizaje profundo aplicado a sistemas de recomendación:
Consolidación Técnica
La experiencia práctica con las arquitecturas VGG ha reforzado la comprensión sobre la reutilización de CNNs como extractores de características (Feature Extractors). La técnica de prescindir de la capa de clasificación final para utilizar las capas densas intermedias (como fc2) se confirma como una estrategia robusta para transformar datos no estructurados (imágenes) en representaciones vectoriales procesables.
Análisis de Costo-Beneficio
Se ha evidenciado que una mayor complejidad del modelo (como en el caso de VGG19) no garantiza necesariamente una mejora perceptible en la calidad subjetiva de las recomendaciones para todos los dominios. Este hallazgo subraya la importancia de evaluar críticamente el trade-off entre la carga computacional y la ganancia marginal en rendimiento al diseñar soluciones para entornos productivos.
Visión Integradora de Negocio
Este enfoque basado en contenido visual no debe considerarse una solución aislada, sino un componente de sistemas híbridos. La integración de estas señales visuales con datos colaborativos (comportamiento de usuario) y metadatos explícitos constituye la base para desarrollar motores de recomendación robustos y altamente personalizados.
Proyección y Trabajo Futuro
Como líneas de investigación futura, sería pertinente evaluar arquitecturas más eficientes y modernas, tales como ResNet, EfficientNet o Vision Transformers (ViT), las cuales podrían ofrecer un mejor balance entre precisión y recursos computacionales. Asimismo, la incorporación de métricas de evaluación cuantitativa (como Precision@K o MAP) complementaría el análisis cualitativo realizado.
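Como referencia, la métrica Precision@K mencionada arriba puede esbozarse en unas pocas líneas (la función `precision_at_k` y los IDs son ilustrativos, no parte del práctico):

```python
def precision_at_k(recomendados, relevantes, k):
    """Fraccion de los primeros k items recomendados que son relevantes."""
    relevantes = set(relevantes)
    top_k = recomendados[:k]
    return sum(1 for item in top_k if item in relevantes) / k

# Ejemplo con IDs de producto ficticios
recs = ['a', 'b', 'c', 'd', 'e']   # ranking producido por el recomendador
rel = {'a', 'c', 'f'}              # items relevantes segun ground truth
print(precision_at_k(recs, rel, 5))  # 2 relevantes entre 5 -> 0.4
```

Para el caso visual, el conjunto relevante podría derivarse, por ejemplo, de la categoría de producto del ítem de consulta, lo que permitiría cuantificar lo que aquí se evaluó cualitativamente.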
Referencias¶
Domínguez, V. (2024). Deep Learning para recomendación (texto, imágenes, multimodal, secuencial) + Práctico. Clase N6 - RecSys MIA [Presentación de clase]. Diplomado Machine Learning Aplicado, Pontificia Universidad Católica de Chile.
Keras Team. (2024). Keras Applications. https://keras.io/api/applications/
Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv preprint arXiv:1409.1556. https://arxiv.org/abs/1409.1556
Chollet, F. (2021). Deep Learning with Python (2nd ed.). Manning Publications.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
🎓 Magister en Inteligencia Artificial¶
Pontificia Universidad Católica de Chile¶
Sistemas Recomendadores
Noviembre 2025